Search Results

Search found 2822 results on 113 pages for 'scheduled backups'.

Page 106/113 | < Previous Page | 102 103 104 105 106 107 108 109 110 111 112 113  | Next Page >

  • How to best design a date/geographic proximity query on GAE?

    - by Dane
    Hi all, I'm building a directory for finding athletic tournaments on GAE with web2py and a Flex front end. The user selects a location, a radius, and a maximum date from a set of choices. I have a basic version of this query implemented, but it's inefficient and slow. One way I know I can improve it is by condensing the many individual queries I'm using to assemble the objects into bulk queries. I just learned that was possible. But I'm also thinking about a more extensive redesign that utilizes memcache. The main problem is that I can't query the datastore by location because GAE won't allow multiple numerical comparison statements (<,<=,=,) in one query. I'm already using one for date, and I'd need TWO to check both latitude and longitude, so it's a no go. Currently, my algorithm looks like this: 1.) Query by date and select 2.) Use destination function from geopy's distance module to find the max and min latitude and longitudes for supplied distance 3.) Loop through results and remove all with lat/lng outside max/min 4.) Loop through again and use distance function to check exact distance, because step 2 will include some areas outside the radius. Remove results outside supplied distance (is this 2/3/4 combination inefficent?) 5.) Assemble many-to-many lists and attach to objects (this is where I need to switch to bulk operations) 6.) Return to client Here's my plan for using memcache.. let me know if I'm way out in left field on this as I have no prior experience with memcache or server caching in general. -Keep a list in the cache filled with "geo objects" that represent all my data. These have five properties: latitude, longitude, event_id, event_type (in anticipation of expanding beyond tournaments), and start_date. This list will be sorted by date. -Also keep a dict of pointers in the cache which represent the start and end indices in the cache for all the date ranges my app uses (next week, 2 weeks, month, 3 months, 6 months, year, 2 years). -Have a scheduled task that updates the pointers daily at 12am. -Add new inserts to the cache as well as the datastore; update pointers. Using this design, the algorithm would now look like: 1.) Use pointers to slice off appropriate chunk of list based on supplied date. 2-4.) Same as above algorithm, except with geo objects 5.) Use bulk operation to select full tournaments using remaining geo objects' event_ids 6.) Assemble many-to-manys 7.) Return to client Thoughts on this approach? Many thanks for reading and any advice you can give. -Dane

    Read the article

  • Limit for Google calendar SMS notification per day

    - by pit777
    What is the limit for Google calendar SMS notification per day? How to detect I reached sms limit? Google write only this: http://www.google.com/support/calendar/bin/answer.py?hl=en&answer=36589 You might have reached the limit for SMS notifications. There is a limit to the number of SMS notifications you can receive each day. This limit shouldn't affect most users, but it's something to keep in mind if you've scheduled a large number of events and are no longer receiving SMS notifications. I created sms reminder based on google calendar api(apps-script), but I think now I reached the daily limit for SMS notifications... but google not informed what is exactly amount of sms limit restriction :/ function emailNotification() { // POWIADOMIENIA SMS O NOWEJ POCZCIE var calendarID = "[email protected]"; // id kalendarza o nazwie „sms4email” var gmailLabelTODO = "autoscriptslabels/sms"; // etykieta „AutoScriptsLabels/SMS” var gmailLabelDONE = "done/_sms"; // etykieta „DONE/_SMS” var calendarEventDescription = "this-is-sms_notification-mark"; // etykieta utworzonego zdarzenia, dzieki której mozna bedzie je znalezc podczas kasowania var calendar = CalendarApp.getCalendarById(calendarID); // otwieramy kalendarz var threads = GmailApp.getUserLabelByName(gmailLabelTODO).getThreads(); // zmienna przechowujaca kolekcje lancuszków z etykieta TODO var now = new Date(); if(threads == 0) return; // zaprzestanie wykonywania, jezeli brak nowych lancuszków for(i in threads) // tworzymy zdarzenia { var title = threads[i].getFirstMessageSubject(); // tytul emaila var startDate = new Date(now.getTime()+120000); var endDate = new Date(now.getTime()+120000); var messages = threads[i].getMessages(); var senderEmail = messages[0].getFrom(); // nadawca emaila var advancedArgs = {location:senderEmail, description:calendarEventDescription}; calendar.createEvent(title, startDate, endDate, advancedArgs); } Utilities.sleep(1000); GmailApp.getUserLabelByName(gmailLabelDONE).addToThreads(threads); // dodawanie etykiety DONE Utilities.sleep(1000); GmailApp.getUserLabelByName(gmailLabelTODO).removeFromThreads(threads); // usuwanie etykiety TODO Utilities.sleep(120000); // czekamy az kalendarz wysle SMS var TodaysEvents = calendar.getEventsForDay(now); for (i in TodaysEvents) // wyszukiwanie wedlug zdarzenia i kasowanie po wyslaniu { if (TodaysEvents[i].getDescription()==calendarEventDescription) TodaysEvents[i].deleteEvent(); } }

    Read the article

  • C# Confusing Results from Performance Test

    - by aip.cd.aish
    I am currently working on an image processing application. The application captures images from a webcam and then does some processing on it. The app needs to be real time responsive (ideally < 50ms to process each request). I have been doing some timing tests on the code I have and I found something very interesting (see below). clearLog(); log("Log cleared"); camera.QueryFrame(); camera.QueryFrame(); log("Camera buffer cleared"); Sensor s = t.val; log("Sx: " + S.X + " Sy: " + S.Y); Image<Bgr, Byte> cameraImage = camera.QueryFrame(); log("Camera output acuired for processing"); Each time the log is called the time since the beginning of the processing is displayed. Here is my log output: [3 ms]Log cleared [41 ms]Camera buffer cleared [41 ms]Sx: 589 Sy: 414 [112 ms]Camera output acuired for processing The timings are computed using a StopWatch from System.Diagonostics. QUESTION 1 I find this slightly interesting, since when the same method is called twice it executes in ~40ms and when it is called once the next time it took longer (~70ms). Assigning the value can't really be taking that long right? QUESTION 2 Also the timing for each step recorded above varies from time to time. The values for some steps are sometimes as low as 0ms and sometimes as high as 100ms. Though most of the numbers seem to be relatively consistent. I guess this may be because the CPU was used by some other process in the mean time? (If this is for some other reason, please let me know) Is there some way to ensure that when this function runs, it gets the highest priority? So that the speed test results will be consistently low (in terms of time). EDIT I change the code to remove the two blank query frames from above, so the code is now: clearLog(); log("Log cleared"); Sensor s = t.val; log("Sx: " + S.X + " Sy: " + S.Y); Image<Bgr, Byte> cameraImage = camera.QueryFrame(); log("Camera output acuired for processing"); The timing results are now: [2 ms]Log cleared [3 ms]Sx: 589 Sy: 414 [5 ms]Camera output acuired for processing The next steps now take longer (sometimes, the next step jumps to after 20-30ms, while the next step was previously almost instantaneous). I am guessing this is due to the CPU scheduling. Is there someway I can ensure the CPU does not get scheduled to do something else while it is running through this code?

    Read the article

  • Modeling distribution of performance measurements

    - by peterchen
    How would you mathematically model the distribution of repeated real life performance measurements - "Real life" meaning you are not just looping over the code in question, but it is just a short snippet within a large application running in a typical user scenario? My experience shows that you usually have a peak around the average execution time that can be modeled adequately with a Gaussian distribution. In addition, there's a "long tail" containing outliers - often with a multiple of the average time. (The behavior is understandable considering the factors contributing to first execution penalty). My goal is to model aggregate values that reasonably reflect this, and can be calculated from aggregate values (like for the Gaussian, calculate mu and sigma from N, sum of values and sum of squares). In other terms, number of repetitions is unlimited, but memory and calculation requirements should be minimized. A normal Gaussian distribution can't model the long tail appropriately and will have the average biased strongly even by a very small percentage of outliers. I am looking for ideas, especially if this has been attempted/analysed before. I've checked various distributions models, and I think I could work out something, but my statistics is rusty and I might end up with an overblown solution. Oh, a complete shrink-wrapped solution would be fine, too ;) Other aspects / ideas: Sometimes you get "two humps" distributions, which would be acceptable in my scenario with a single mu/sigma covering both, but ideally would be identified separately. Extrapolating this, another approach would be a "floating probability density calculation" that uses only a limited buffer and adjusts automatically to the range (due to the long tail, bins may not be spaced evenly) - haven't found anything, but with some assumptions about the distribution it should be possible in principle. Why (since it was asked) - For a complex process we need to make guarantees such as "only 0.1% of runs exceed a limit of 3 seconds, and the average processing time is 2.8 seconds". The performance of an isolated piece of code can be very different from a normal run-time environment involving varying levels of disk and network access, background services, scheduled events that occur within a day, etc. This can be solved trivially by accumulating all data. However, to accumulate this data in production, the data produced needs to be limited. For analysis of isolated pieces of code, a gaussian deviation plus first run penalty is ok. That doesn't work anymore for the distributions found above. [edit] I've already got very good answers (and finally - maybe - some time to work on this). I'm starting a bounty to look for more input / ideas.

    Read the article

  • Parallel.For System.OutOfMemoryException

    - by Martin Neal
    We have a fairly simple program that's used for creating backups. I'm attempting to parallelize it but am getting an OutofMemoryException within an AggregateExcption. Some of the source folders are quite large, and the program doesn't crash for about 40 minutes after it starts. I don't know where to start looking so the below code is a near exact dump of all code the code sans directory structure and Exception logging code. Any advise as to where to start looking? using System; using System.Diagnostics; using System.IO; using System.Threading.Tasks; namespace SelfBackup { class Program { static readonly string[] saSrc = { "\\src\\dir1\\", //... "\\src\\dirN\\", //this folder is over 6 GB }; static readonly string[] saDest = { "\\dest\\dir1\\", //... "\\dest\\dirN\\", }; static void Main(string[] args) { Parallel.For(0, saDest.Length, i => { try { if (Directory.Exists(sDest)) { //Delete directory first so old stuff gets cleaned up Directory.Delete(sDest, true); } //recursive function clsCopyDirectory.copyDirectory(saSrc[i], sDest); } catch (Exception e) { //standard error logging CL.EmailError(); } }); } } /////////////////////////////////////// using System.IO; using System.Threading.Tasks; namespace SelfBackup { static class clsCopyDirectory { static public void copyDirectory(string Src, string Dst) { Directory.CreateDirectory(Dst); /* Copy all the files in the folder If and when .NET 4.0 is installed, change Directory.GetFiles to Directory.Enumerate files for slightly better performance.*/ Parallel.ForEach<string>(Directory.GetFiles(Src), file => { /* An exception thrown here may be arbitrarily deep into this recursive function there's also a good chance that if one copy fails here, so too will other files in the same directory, so we don't want to spam out hundreds of error e-mails but we don't want to abort all together. Instead, the best solution is probably to throw back up to the original caller of copy directory an move on to the next Src/Dst pair by not catching any possible exception here.*/ File.Copy(file, //src Path.Combine(Dst, Path.GetFileName(file)), //dest true);//bool overwrite }); //Call this function again for every directory in the folder. Parallel.ForEach(Directory.GetDirectories(Src), dir => { copyDirectory(dir, Path.Combine(Dst, Path.GetFileName(dir))); }); } }

    Read the article

  • Flex/Flash 4 datagrid displays raw xml

    - by Setori
    Problem: Flex/Flash4 client (built with FlashBuilder4) displays the xml sent from the server exactly as is - the datagrid keeps the format of the xml. I need the datagrid to parse the input and place the data in the correct rows and columns of the datagrid. flow: click on a date in the tree and it makes a server request for batch information in xml form. Using a CallResponder I then update the datagrid's dataProvider. [code] <fx:Script> <![CDATA[ import mx.controls.Alert; [Bindable]public var selectedTreeNode:XML; public function taskTreeChanged(event:Event):void { selectedTreeNode=Tree(event.target).selectedItem as XML; var searchHubId:String = selectedTreeNode.@hub; var searchDate:String = selectedTreeNode.@lbl; if((searchHubId == "") || (searchDate == "")){ return; } findShipmentBatches(searchDate,searchHubId); } protected function findShipmentBatches(searchDate:String, searchHubId:String):void{ findShipmentBatchesResult.token = actWs.findShipmentBatches(searchDate, searchHubId); } protected function updateBatchDataGridDP():void{ task_list_dg.dataProvider = findShipmentBatchesResult.lastResult; } ]]> </fx:Script> <fx:Declarations> <actws:ActWs id="actWs" fault="Alert.show(event.fault.faultString + '\n' + event.fault.faultDetail)" showBusyCursor="true"/> <s:CallResponder id="findShipmentBatchesResult" result="updateBatchDataGridDP()"/> </fx:Declarations> <mx:AdvancedDataGrid id="task_list_dg" width="100%" height="95%" paddingLeft="0" paddingTop="0" paddingBottom="0"> <mx:columns> <mx:AdvancedDataGridColumn headerText="Receiving date" dataField="rd"/> <mx:AdvancedDataGridColumn headerText="Msg type" dataField="mt"/> <mx:AdvancedDataGridColumn headerText="SSD" dataField="ssd"/> <mx:AdvancedDataGridColumn headerText="Shipping site" dataField="sss"/> <mx:AdvancedDataGridColumn headerText="File name" dataField="fn"/> <mx:AdvancedDataGridColumn headerText="Batch number" dataField="bn"/> </mx:columns> </mx:AdvancedDataGrid> //xml example from server <batches> <batch> <rd>2010-04-23 16:31:00.0</rd> <mt>SC1REVISION01</mt> <ssd>2010-02-18 00:00:00.0</ssd> <sss>100000009</sss> <fn>Revision 1-DF-Ocean-SC1SUM-Quanta-PACT-EMEA-Scheduled Ship Date 20100218.csv</fn> <bn>10041</bn> </batch> <batches> [/code] and the xml is pretty much displayed exactly as is shown in the example above in the datagrid columns... I would appreciate your assistance.

    Read the article

  • Why does PostgresQL query performance drop over time, but restored when rebuilding index

    - by Jim Rush
    According to this page in the manual, indexes don't need to be maintained. However, we are running with a PostgresQL table that has a continuous rate of updates, deletes and inserts that over time (a few days) sees a significant query degradation. If we delete and recreate the index, query performance is restored. We are using out of the box settings. The table in our test is currently starting out empty and grows to half a million rows. It has a fairly large row (lots of text fields). We are search is based of an index, not the primary key (I've confirmed the index is being used, at least under normal conditions) The table is being used as a persistent store for a single process. Using PostgresQL on Windows with a Java client I'm willing to give up insert and update performance to keep up the query performance. We are considering rearchitecting the application so that data is spread across various dynamic tables in a manner that allows us to drop and rebuild indexes periodically without impacting the application. However, as always, there is a time crunch to get this to work and I suspect we are missing something basic in our configuration or usage. We have considered forcing vacuuming and rebuild to run at certain times, but I suspect the locking period for such an action would cause our query to block. This may be an option, but there are some real-time (windows of 3-5 seconds) implications that require other changes in our code. Additional information: Table and index CREATE TABLE icl_contacts ( id bigint NOT NULL, campaignfqname character varying(255) NOT NULL, currentstate character(16) NOT NULL, xmlscheduledtime character(23) NOT NULL, ... 25 or so other fields. Most of them fixed or varying character fiel ... CONSTRAINT icl_contacts_pkey PRIMARY KEY (id) ) WITH (OIDS=FALSE); ALTER TABLE icl_contacts OWNER TO postgres; CREATE INDEX icl_contacts_idx ON icl_contacts USING btree (xmlscheduledtime, currentstate, campaignfqname); Analyze: Limit (cost=0.00..3792.10 rows=750 width=32) (actual time=48.922..59.601 rows=750 loops=1) - Index Scan using icl_contacts_idx on icl_contacts (cost=0.00..934580.47 rows=184841 width=32) (actual time=48.909..55.961 rows=750 loops=1) Index Cond: ((xmlscheduledtime < '2010-05-20T13:00:00.000'::bpchar) AND (currentstate = 'SCHEDULED'::bpchar) AND ((campaignfqname)::text = '.main.ee45692a-6113-43cb-9257-7b6bf65f0c3e'::text)) And, yes, I am aware there there are a variety of things we could do to normalize and improve the design of this table. Some of these options may be available to us. My focus in this question is about understanding how PostgresQL is managing the index and query over time (understand why, not just fix). If it were to be done over or significantly refactored, there would be a lot of changes.

    Read the article

  • Cases of companies taking IP rights of your own personal projects developed outside company time

    - by GSS
    Hi, I have heard of cases where a developer working for a company is also making his own personal projects in his own time, using his own equipment yet the company he works for tries to claim ownership for the project. I really find this annoying, and bang out of order. It should also be illegal. I am in this position (work for a company and working on my own systems - from small class libraries used to practise what I learn in my exam revision to a large commercial-scale system). While I don't know if the company will try to take ownership, all I know is they say they do not want a conflict of interest. Fair enough, my system is developed in my own time using my own equipment. They also say that work time should be for work only, which it is. Funny thing that as work is so boring, easy and slow that I have plenty of free time, which I wish I could spend on something productive - said system. The problem is, my company does not take hiring technical talent seriously. This is my first job, I am a junior coder (but my status/position doesn't really reflect what I can do), but I am the only developer. Likewise with the guy who controls Windows Server. As the contract does not say anything about taking ownership, I would assume they would. They would try to milk my success (I've made a good impression so I am sure they would). How can this be allowed? Are there any examples of this happening to any fellow Stacker here? It really makes my blood boil. What I find funny is that my company hardly has the expertise and resources to even be able to successfully run a project of my size. What I do at work is an ASP.NET application consisting of five pages, and even then there are flaws in the project. If I told them that they would also have to take responsibility for flaws in the project, then they would think twice! It's exactly because of this I save the best code for myself and at work I write rubbish code full of code smells. The company don't really care about error handling, as long as the business functionality works (ie a scheduled email sends, but there is no error handling). They'd think twice when they see the embarassment and business cost of a YSOD...

    Read the article

  • Flex/Flash 4 datagrid literally displays XML

    - by Setori
    Problem: Flex/Flash4 client (built with FlashBuilder4) displays the xml sent from the server exactly as is - the datagrid keeps the format of the xml. I need the datagrid to parse the input and place the data in the correct rows and columns of the datagrid. flow: click on a date in the tree and it makes a server request for batch information in xml form. Using a CallResponder I then update the datagrid's dataProvider. [code] <fx:Script> <![CDATA[ import mx.controls.Alert; [Bindable]public var selectedTreeNode:XML; public function taskTreeChanged(event:Event):void { selectedTreeNode=Tree(event.target).selectedItem as XML; var searchHubId:String = selectedTreeNode.@hub; var searchDate:String = selectedTreeNode.@lbl; if((searchHubId == "") || (searchDate == "")){ return; } findShipmentBatches(searchDate,searchHubId); } protected function findShipmentBatches(searchDate:String, searchHubId:String):void{ findShipmentBatchesResult.token = actWs.findShipmentBatches(searchDate, searchHubId); } protected function updateBatchDataGridDP():void{ task_list_dg.dataProvider = findShipmentBatchesResult.lastResult; } ]]> </fx:Script> <fx:Declarations> <actws:ActWs id="actWs" fault="Alert.show(event.fault.faultString + '\n' + event.fault.faultDetail)" showBusyCursor="true"/> <s:CallResponder id="findShipmentBatchesResult" result="updateBatchDataGridDP()"/> </fx:Declarations> <mx:AdvancedDataGrid id="task_list_dg" width="100%" height="95%" paddingLeft="0" paddingTop="0" paddingBottom="0"> <mx:columns> <mx:AdvancedDataGridColumn headerText="Receiving date" dataField="rd"/> <mx:AdvancedDataGridColumn headerText="Msg type" dataField="mt"/> <mx:AdvancedDataGridColumn headerText="SSD" dataField="ssd"/> <mx:AdvancedDataGridColumn headerText="Shipping site" dataField="sss"/> <mx:AdvancedDataGridColumn headerText="File name" dataField="fn"/> <mx:AdvancedDataGridColumn headerText="Batch number" dataField="bn"/> </mx:columns> </mx:AdvancedDataGrid> [/code] I cannot upload a pic, but this is the xml: [code] 2010-04-23 16:35:51.0 PRESHIP 2010-02-15 00:00:00.0 100000009 DF-Ocean-PRESHIPSUM-Quanta-PACT-EMEA-Scheduled Ship Date 20100215.csv 10053 [/code] and the xml is pretty much displayed exactly as is in the datagrid columns... I would appreciate your assistance.

    Read the article

  • quartz: preventing concurrent instances of a job in jobs.xml

    - by Jason S
    This should be really easy. I'm using Quartz running under Apache Tomcat 6.0.18, and I have a jobs.xml file which sets up my scheduled job that runs every minute. What I would like to do, is if the job is still running when the next trigger time rolls around, I don't want to start a new job, so I can let the old instance complete. Is there a way to specify this in jobs.xml (prevent concurrent instances)? If not, is there a way I can share access to an in-memory singleton within my application's Job implementation (is this through the JobExecutionContext?) so I can handle the concurrency myself? (and detect if a previous instance is running) update: After floundering around in the docs, here's a couple of approaches I am considering, but either don't know how to get them to work, or there are problems. Use StatefulJob. This prevents concurrent access... but I'm not sure what other side-effects would occur if I use it, also I want to avoid the following situation: Suppose trigger times would be every minute, i.e. trigger#0 = at time 0, trigger #1 = 60000msec, #2 = 120000, #3 = 180000, etc. and the trigger#0 at time 0 fires my job which takes 130000msec. With a plain Job, this would execute triggers #1 and #2 while job trigger #0 is still running. With a StatefulJob, this would execute triggers #1 and #2 in order, immediately after #0 finishes at 130000. I don't want that, I want #1 and #2 not to run and the next trigger that runs a job should take place at #3 (180000msec). So I still have to do something else with StatefulJob to get it to work the way I want, so I don't see much of an advantage to using it. Use a TriggerListener to return true from vetoJobExecution(). Although implementing the interface seems straightforward, I have to figure out how to setup one instance of a TriggerListener declaratively. Can't find the docs for the xml file. Use a static shared thread-safe object (e.g. a semaphore or whatever) owned by my class that implements Job. I don't like the idea of using singletons via the static keyword under Tomcat/Quartz, not sure if there are side effects. Also I really don't want them to be true singletons, just something that is associated with a particular job definition. Implement my own Trigger which extends SimpleTrigger and contains shared state that could run its own TriggerListener. Again, I don't know how to setup the XML file to use this trigger rather than the standard <trigger><simple>...</simple></trigger>.

    Read the article

  • Default Database Collations PenTesting Env

    - by dominicdinada
    I am using Ubuntu 9.10 with XAMPP ( Lampp "MYSQL 5.1.45 PHPMYADMIN 3.3.1 PHP 5.3.2 ) What my problem is, is that I set up my testing env to debug my scripts locally and when I did so there arose a problem. This problem is that I used firefox's addon SQLinject ME to test for weakness' and upon doing so it caused mysql to change the default local collations; character sets dir /opt/lampp/share/mysql/charsets/ collation connection latin1_general_ci (Global value) latin1_swedish_ci collation database latin1_swedish_ci collation server latin1_swedish_ci I have searched for quite sometime in regards to a solution to this problem and have come up with searching for the db.opt file which stores this information without success. Upon not finding a solution I removed lampp with the "sudo rm -fR /opt" command and reinstall and the problem still persists. I have tried to change the collations manually and still come up with the database displaying latin1_swedish_ci as the default language. Why is this a problem?? Why is it a problem with mysql? Because the application I am testing and debugging locally is built on the CodeIgnitor with Smarty framework and since this combination of framework is built to detect the LOCALES, Rather what the database defaults are I keep getting errors saying no language file for swedish...... Of course I could get the swedish language file to work around this problem but I do not feel the need to make this work around a perminant solution as with time when I move on to projects I will run into simular problems every time that A; When importing database files, backups etc it will default to import such databases as the locale swedish. B; As time passes on I might completly forget of this error and will be back to square one. I have found this code in searches for a fix,which seems to alter the tables to a desired Collaion; $value) { mysql_query("ALTER TABLE $value COLLATE latin1_general_ci"); }} echo "The collation of your database has been successfully changed!"; ? Which is handy to switch collations in One Schema at a time however this is not a fix when a framework doesnt care that the said database is in one langugae. It tests for the Default of the entire server. Someone with any knowledge of a purge or fix to this I would greatly appricate the help. One more final note is that when I was testing I only figured to back up the applications DataBase and not the entire Schema of the install. No matter if I uninstall or reinstall the database still seems to carry these problems.

    Read the article

  • Recursive N-way merge/diff algorithm for directory trees?

    - by BobMcGee
    What algorithms or Java libraries are available to do N-way, recursive diff/merge of directories? I need to be able to generate a list of folder trees that have many identical files, and have subdirectories with many similar files. I want to be able to use 2-way merge operations to quickly remove as much redundancy as possible. Goals: Find pairs of directories that have many similar files between them. Generate short list of directory pairs that can be synchronized with 2-way merge to eliminate duplicates Should operate recursively (there may be nested duplicates of higher-level directories) Run time and storage should be O(n log n) in numbers of directories and files Should be able to use an embedded DB or page to disk for processing more files than fit in memory (100,000+). Optional: generate an ancestry and change-set between folders Optional: sort the merge operations by how many duplicates they can elliminate I know how to use hashes to find duplicate files in roughly O(n) space, but I'm at a loss for how to go from this to finding partially overlapping sets between folders and their children. EDIT: some clarification The tricky part is the difference between "exact same" contents (otherwise hashing file hashes would work) and "similar" (which will not). Basically, I want to feed this algorithm at a set of directories and have it return a set of 2-way merge operations I can perform in order to reduce duplicates as much as possible with as few conflicts possible. It's effectively constructing an ancestry tree showing which folders are derived from each other. The end goal is to let me incorporate a bunch of different folders into one common tree. For example, I may have a folder holding programming projects, and then copy some of its contents to another computer to work on it. Then I might back up and intermediate version to flash drive. Except I may have 8 or 10 different versions, with slightly different organizational structures or folder names. I need to be able to merge them one step at a time, so I can chose how to incorporate changes at each step of the way. This is actually more or less what I intend to do with my utility (bring together a bunch of scattered backups from different points in time). I figure if I can do it right I may as well release it as a small open source util. I think the same tricks might be useful for comparing XML trees though.

    Read the article

  • dynamiclly schedule a lead sales agent

    - by Josh
    I have a website that I'm trying to migrate from classic asp to asp.net. It had a lead schedule, where each sales agent would be featured for the current day, or part of the day.The next day a new agent would be scheduled. It was driven off a database table that had a row for each day in it. So to figure out if a sales agent would show on a day, it was easy, just find today's date in the table. Problem was it ran out rows, and you had to run a script to update the lead days 6 months at a time. Plus if there was ever any change to the schedule, you had to delete all the rows and re-run the script. So I'm trying to code it where sql server figures that out for me, and no script has to be ran. I have a table like so CREATE TABLE [dbo].[LeadSchedule]( [leadid] [int] IDENTITY(1,1) NOT NULL, [userid] [int] NOT NULL, [sunday] [bit] NOT NULL, [monday] [bit] NOT NULL, [tuesday] [bit] NOT NULL, [wednesday] [bit] NOT NULL, [thursday] [bit] NOT NULL, [friday] [bit] NOT NULL, [saturday] [bit] NOT NULL, [StartDate] [smalldatetime] NULL, [EndDate] [smalldatetime] NULL, [StartTime] [time](0) NULL, [EndTime] [time](0) NULL, [order] [int] NULL, So the user can schedule a sales agent depending on their work schedule. Also if they wanted to they could split certain days, or sales agents by time, So from Midnight to 4 it was one agent, from 4-midnight it was another. So far I've tried using a numbers table, row numbers, goofy date math, and I'm at a loss. Any suggestions on how to handle this purely from sql code? If it helps, the table should always be small, like less than 20 never over 100. update After a few hours all I've managed to come up with is the below. It doesn't handle filling in days not available or times, just rotates through all the sales agents with leadTable as ( select leadid,userid,[order],StartDate, case DATEPART(dw,getdate()) when 1 then sunday when 2 then monday when 3 then tuesday when 4 then wednesday when 5 then thursday when 6 then friday when 7 then saturday end as DayAvailable , ROW_NUMBER() OVER (ORDER BY [order] ASC) AS ROWID from LeadSchedule where GETDATE()>=StartDate and (CONVERT(time(0),GETDATE())>= StartTime or StartTime is null) and (CONVERT(time(0),GETDATE())<= EndTime or EndTime is null) ) select userid, DATEADD(d,(number+ROWID-2)*totalUsers,startdate ) leadday from (select *, (select COUNT(1) from leadTable) totalUsers from leadTable inner join Numbers on 1=1 where DayAvailable =1 ) tb1 order by leadday asc

    Read the article

  • Designing a database file format

    - by RoliSoft
    I would like to design my own database engine for educational purposes, for the time being. Designing a binary file format is not hard nor the question, I've done it in the past, but while designing a database file format, I have come across a very important question: How to handle the deletion of an item? So far, I've thought of the following two options: Each item will have a "deleted" bit which is set to 1 upon deletion. Pro: relatively fast. Con: potentially sensitive data will remain in the file. 0x00 out the whole item upon deletion. Pro: potentially sensitive data will be removed from the file. Con: relatively slow. Recreating the whole database. Pro: no empty blocks which makes the follow-up question void. Con: it's a really good idea to overwrite the whole 4 GB database file because a user corrected a typo. I will sell this method to Twitter ASAP! Now let's say you already have a few empty blocks in your database (deleted items). The follow-up question is how to handle the insertion of a new item? Append the item to the end of the file. Pro: fastest possible. Con: file will get huge because of all the empty blocks that remain because deleted items aren't actually deleted. Search for an empty block exactly the size of the one you're inserting. Pro: may get rid of some blocks. Con: you may end up scanning the whole file at each insert only to find out it's very unlikely to come across a perfectly fitting empty block. Find the first empty block which is equal or larger than the item you're inserting. Pro: you probably won't end up scanning the whole file, as you will find an empty block somewhere mid-way; this will keep the file size relatively low. Con: there will still be lots of leftover 0x00 bytes at the end of items which were inserted into bigger empty blocks than they are. Rigth now, I think the first deletion method and the last insertion method are probably the "best" mix, but they would still have their own small issues. Alternatively, the first insertion method and scheduled full database recreation. (Probably not a good idea when working with really large databases. Also, each small update in that method will clone the whole item to the end of the file, thus accelerating file growth at a potentially insane rate.) Unless there is a way of deleting/inserting blocks from/to the middle of the file in a file-system approved way, what's the best way to do this? More importantly, how do databases currently used in production usually handle this?

    Read the article

  • Is there a scheduling algorithm that optimizes for "maker's schedules"?

    - by John Feminella
    You may be familiar with Paul Graham's essay, "Maker's Schedule, Manager's Schedule". The crux of the essay is that for creative and technical professionals, meetings are anathema to productivity, because they tend to lead to "schedule fragmentation", breaking up free time into chunks that are too small to acquire the focus needed to solve difficult problems. In my firm we've seen significant benefits by minimizing the amount of disruption caused, but the brute-force algorithm we use to decide schedules is not sophisticated enough to handle scheduling large groups of people well. (*) What I'm looking for is if there's are any well-known algorithms which minimize this productivity disruption, among a group of N makers and managers. In our model, There are N people. Each person pi is either a maker (Mk) or a manager (Mg). Each person has a schedule si. Everyone's schedule is H hours long. A schedule consists of a series of non-overlapping intervals si = [h1, ..., hj]. An interval is either free or busy. Two adjacent free intervals are equivalent to a single free interval that spans both. A maker's productivity is maximized when the number of free intervals is minimized. A manager's productivity is maximized when the total length of free intervals is maximized. Notice that if there are no meetings, both the makers and the managers experience optimum productivity. If meetings must be scheduled, then makers prefer that meetings happen back-to-back, while managers don't care where the meeting goes. Note that because all disruptions are treated as equally harmful to makers, there's no difference between a meeting that lasts 1 second and a meeting that lasts 3 hours if it segments the available free time. The problem is to decide how to schedule M different meetings involving arbitrary numbers of the N people, where each person in a given meeting must place a busy interval into their schedule such that it doesn't overlap with any other busy interval. For each meeting Mt the start time for the busy interval must be the same for all parties. Does an algorithm exist to solve this problem or one similar to it? My first thought was that this looks really similar to defragmentation (minimize number of distinct chunks), and there are a lot of algorithms about that. But defragmentation doesn't have much to do with scheduling. Thoughts? (*) Practically speaking this is not really a problem, because it's rare that we have meetings with more than ~5 people at once, so the space of possibilities is small.

    Read the article

  • WLI domain with 3 servers - issues on JPD process startup

    - by XpiritO
    Hi there. I'm currently working on a clustered WLI environment which comprehends 3 servers: 1 admin server ("AdminServer") and 2 managed servers ("mn1" and "mn2") grouped as a cluster, as follows: Architecture diagram: http://img72.imageshack.us/img72/4112/clusterdiagram.jpg I've developed a JPD process to execute some scheduled tasks, invoked using a Message Broker. I've deployed this project into a single-server WLI domain (with AdminServer only) and it works as expected: the JPD process is invoked (I've configured a Timer Event Generator instance to start it up). Message broker: http://img532.imageshack.us/img532/1443/wlimessagebroker.jpg Timer event generator: http://img408.imageshack.us/img408/7358/wlitimereventgenerator.jpg In order to achieve fail-over and load-balancing capabilities, I'm currently trying to deploy this JPD process into this clustered WLI environment. Although, I'm having some issues with this, as I cannot get it to work properly, even if it still works. Here is a screenshot of the "WLI Process Instance Monitor" (with AdminServer and mn1 instances up and running): http://img710.imageshack.us/img710/8477/wliprocessinstancemonit.jpg According to this screen the process seems to be running, as it shows in this instance monitor screen. However, I don't see any output coming out neither at AdminServer console or mn1 console. In single-server domain it was visible output from JPD process "timeout" callback method, wich implementation is shown below: @com.bea.wli.control.broker.MessageBroker.StaticSubscription(xquery = "", filterValueMatch = "", channelName = "/SamplePrefix/Samples/SampleStringChannel", messageBody = "{x0}") public void subscription(java.lang.String x0) { String toReturn=""; try { Context myCtx = new InitialContext(); MBeanHome mbeanHome = (MBeanHome)myCtx.lookup("weblogic.management.home.localhome"); toReturn=mbeanHome.getMBeanServer().getServerName(); System.out.println("**** executed at **** " + System.currentTimeMillis() + " by: " + toReturn); } catch (Exception e) { System.out.println("Exception!"); e.printStackTrace(); } } (...) @org.apache.beehive.controls.api.events.EventHandler(field = "myT", eventSet = com.bea.control.WliTimerControl.Callback.class, eventName = "onTimeout") public void myT_onTimeout(long time, java.io.Serializable data) { // #START: CODE GENERATED - PROTECTED SECTION - you can safely add code above this comment in this method. #// // input transform System.out.println("**** published at **** " + System.currentTimeMillis()); publishControl.publish("aaaa"); // parameter assignment // #END : CODE GENERATED - PROTECTED SECTION - you can safely add code below this comment in this method. #// } and here is the output visible at "AdminServer" console in single-server domain testing: **** published at **** 1273238090713 **** executed at **** 1273238132123 by: AdminServer **** published at **** 1273238152462 **** executed at **** 1273238152562 by: AdminServer (...) What may be wrong with my clustered configuration? Am I missing something to accomplish clustered deployment? Thanks in advance for your help.

    Read the article

  • SQL/Schema comparison and upgrade

    - by Workshop Alex
    I have a simple situation. A large organisation is using several different versions of some (desktop) application and each version has it's own database structure. There are about 200 offices and each office will have it's own version, which can be one of 7 different ones. The company wants to upgrade all applications to the latest versions, which will be version 8. The problem is that they don't have a separate database for each version. Nor do they have a separate database for each office. They have one single database which is handled by a dedicated server, thus keeping things like management and backups easier. Every office has it's own database schema and within the schema there's the whole database structure for their specific application version. As a result, I'm dealing with 200 different schema's which need to be upgraded, each with 7 possible versions. Fortunately, every schema knows the proper version so checking the version isn't difficult. But my problem is that I need to create upgrade scripts which can upgrade from version 1 to version 2 to version 3 to etc... Basically, all schema's need to be bumped up one version until they're all version 8. Writing the code that will do this is no problem. the challenge is how to create the upgrade script from one version to the other? Preferably with some automated tool. I've examined RedGate's SQL Compare and Altova's DatabaseSpy but they're not practical. Altova is way too slow. RedGate requires too much processing afterwards, since the generated SQL Script still has a few errors and it refers to the schema name. Furthermore, the code needs to become part of a stored procedure and the code generated by RedGate doesn't really fit inside a single procedure. (Plus, it's doing too much transaction-handling, while I need everything within a single transaction. I have been considering using another SQL Comparison tool but it seems to me that my case is just too different from what standard tools can deliver. So I'm going to write my own comparison tool. To do this, I'll be using ADOX with Delphi to read the catalogues for every schema version in the database, then use this to write the SQL Statements that will need to upgrade these schema's to their next version. (Comparing 1 with 2, 2 with 3, 3 with 4, etc.) I'm not unfamiliar with generating SQL-Script-Generators so I don't expect too many problems. And I'll only be upgrading the table structures, not any of the other database objects. So, does anyone have some good tips and tricks to apply when doing this kind of comparisons? Things to be aware of? Practical tips to increase speed?

    Read the article

  • Doing some stuff right before the user exits the page

    - by Mike
    I have seen some questions here regarding what I want to achieve and have based what I have so far on those answer. But there is a slight misbehavior that is still irritating me. What I have is sort of a recovery feature. Whenever you are typing text, the client sends a sync request to the server every 45 seconds. It does 2 things. First, it extends the lease the client has on the record (only one person may edit at one time) for another 60 seconds. Second, it sends the text typed so far to the server in case the server crashes, internet connection fails, etc. In that case, the next time the user enters our application, the user is notified that something has gone wrong and that some text was recovered. Think of Microsoft or OpenOffice recovery whenever they crash! Of course, if the user leaves the page willingly, the user does not need to be notified and as a result, the recovery is deleted. I do that final request via a beforeunload event. Everything went fine until I was asked to make a final adjustment... The same behavior you have here at stack overflow when you exit the editor... a confirm dialogue. This works so far, BUT, the confirm dialogue is shown twice. Here is the code. The event if (local.sync.autosave_textelement) { window.onbeforeunload = exitConfirm; } The function function exitConfirm() { var local = Core; if (confirm('blub?')) { local.sync.autosave_destroy = true; sync(false); return true; } else { return false; } }; Some problem irrelevant clarifications: Core is a global Object that contains a lot of variables that are used everywhere. sync makes an ajax request. The values are based on the values that the Core.sync object contains. The parameter determines if the call should be async (default) or sync. Edit 1 I did try to separate both things (recovery deletion and user confirmation that is) into beforeunload and unload. The problem there was that unload is a bit too late. The user gets informed that there is a recovery even though it is scheduled to be deleted. If you refresh the page 1 second later, the dialogue disappears as the file was deleted by then.

    Read the article

  • Loading datasets from datastore and merge into single dictionary. Resource problem.

    - by fredrik
    Hi, I have a productdatabase that contains products, parts and labels for each part based on langcodes. The problem I'm having and haven't got around is a huge amount of resource used to get the different datasets and merging them into a dict to suit my needs. The products in the database are based on a number of parts that is of a certain type (ie. color, size). And each part has a label for each language. I created 4 different models for this. Products, ProductParts, ProductPartTypes and ProductPartLabels. I've narrowed it down to about 10 lines of code that seams to generate the problem. As of currently I have 3 Products, 3 Types, 3 parts for each type, and 2 languages. And the request takes a wooping 5500ms to generate. for product in productData: productDict = {} typeDict = {} productDict['productName'] = product.name cache_key = 'productparts_%s' % (slugify(product.key())) partData = memcache.get(cache_key) if not partData: for type in typeData: typeDict[type.typeId] = { 'default' : '', 'optional' : [] } ## Start of problem lines ## for defaultPart in product.defaultPartsData: for label in labelsForLangCode: if label.key() in defaultPart.partLabelList: typeDict[defaultPart.type.typeId]['default'] = label.partLangLabel for optionalPart in product.optionalPartsData: for label in labelsForLangCode: if label.key() in optionalPart.partLabelList: typeDict[optionalPart.type.typeId]['optional'].append(label.partLangLabel) ## end problem lines ## memcache.add(cache_key, typeDict, 500) partData = memcache.get(cache_key) productDict['parts'] = partData productList.append(productDict) I guess the problem lies in the number of for loops is too many and have to iterate over the same data over and over again. labelForLangCode get all labels from ProductPartLabels that match the current langCode. All parts for a product is stored in a db.ListProperty(db.key). The same goes for all labels for a part. The reason I need the some what complex dict is that I want to display all data for a product with it's default parts and show a selector for the optional one. The defaultPartsData and optionaPartsData are properties in the Product Model that looks like this: @property def defaultPartsData(self): return ProductParts.gql('WHERE __key__ IN :key', key = self.defaultParts) @property def optionalPartsData(self): return ProductParts.gql('WHERE __key__ IN :key', key = self.optionalParts) When the completed dict is in the memcache it works smoothly, but isn't the memcache reset if the application goes in to hibernation? Also I would like to show the page for first time user(memcache empty) with out the enormous delay. Also as I said above, this is only a small amount of parts/product. What will the result be when it's 30 products with 100 parts. Is one solution to create a scheduled task to cache it in the memcache every hour? It this efficient? I know this is alot to take in, but I'm stuck. I've been at this for about 12 hours straight. And can't figure out a solution. ..fredrik

    Read the article

  • AnimationDrawable, when does it end?

    - by Syb
    I know there have been several people with the same question. Which is: How do i know when a frame by frame animation has ended? I have not had any useful answer on fora i visited. So i thought, let's see if they know at stackoverflow. But I could not sit still in the mean time, so i made a work around of this, but it does not really work the way i would like it to. here is the code: public class Main extends Activity { AnimationDrawable sybAnimation; /** Called when the activity is first created. */ @Override public void onCreate(Bundle savedInstanceState) { super.onCreate(savedInstanceState); setContentView(R.layout.main); ImageView imageView = (ImageView)findViewById(R.id.ImageView01); imageView.setBackgroundResource(R.anim.testanimation); sybAnimation = (AnimationDrawable) imageView.getBackground(); imageView.post(new Starter()); } class Starter implements Runnable { public void run() { sybAnimation.start(); long totalDuration = 0; for(int i = 0; i< sybAnimation.getNumberOfFrames();i++){ totalDuration += sybAnimation.getDuration(i); } Timer timer = new Timer(); timer.schedule(new AnimationFollowUpTimerTask(R.id.ImageView01, R.anim.testanimation_reverse),totalDuration); } } class AnimationFollowUpTimerTask extends TimerTask { private int id; private int animationToRunId; public AnimationFollowUpTimerTask(int idOfImageView, int animationXML){ id = idOfImageView; animationToRunId = animationXML; } @Override public void run() { ImageView imageView = (ImageView)findViewById(id); imageView.setBackgroundResource(animationToRunId); AnimationDrawable anim = (AnimationDrawable) imageView.getBackground(); anim.start(); } } basically I make a timertask which is scheduled with the same time as the animation to take. In that run() I want to load a new animation into the imageView and start that animation, this however does not work. Does anyone know how to get this to work, or even better, have a better way to find out when an AnimationDrawable has ended its animation?

    Read the article

  • Difference between Cloud and Virtualization

    - by Akash Kava
    Ops: This does not belong to ServerFault because it focuses on Programing Architecture. I have following questions regarding differences between Cloud and Virtualization.. How Cloud is different then Virtualization? Currently I tried to find out pricing of Rackspace, Amazone and all similar cloud providers, I found that our current 6 dedicated servers came cheaper then their pricing. So how one can claim cloud is cheaper? Is it cheaper only in comparison of normal hosting? We re organized our infrastructure in virtual environment to reduce or configuration overhead at time of failure, we did not have to rewrite any peice of code that is already written for earlier setup. So moving to virtualization does not require any re programming. But cloud is absoltely different and it will require entire reprogramming right? Is it really worth to recode when our current IT costs are 3-4 times lower then cloud hosting including raid backups and all sort of clustering for high availability? New programming architecture means new overheads of training staff, new methods of testing and new deployment schemes, does it justify over "on demand resource usage" words of cloud? We are having current development architecture with simple Server side ASP.NET WebServices with no local context and on client side Flex/Silverlight which offers pretty good REST architecture and its highly scalable. How does cloud differs from REST model of deployment? On storage, SQL Server or MySQL offers pretty good replication and high availibility then what is advantage in cloud? Data guarantee, one of our vendor hosting some other customer's app on cloud (one of most used), lost Entire Hard Disk (the virtual) and entire module in first 6 months. Second provider said its your duty to take backup, fine I agree, but no provider gives SLA for data guarantee, they give 99% uptime. However in most business apps, uptime is less important then data integrity. In our 10 years of dedicated hosting experience we had only one hard disk crash. This makes me little skeptical to go for cloud and loosing control over data. And I feel its just a big marketing buzz to sell virtulization in different form. Size of data, currently all providers charge very heavy for large data, if you are hosting only below 100GB cloud can be good alternative, but I think virtual servers and dedicated servers above 100GB to few TBs are still cheaper. Why would want to pay so high on cloud when there is no data guarentee as well as it doesnt say anything about redundancy. (I wish SO had something for spell check for Internet Explorer, sorry for wrong spellings in my post)

    Read the article

  • Multiset container appears to stop sorting

    - by Sarah
    I would appreciate help debugging some strange behavior by a multiset container. Occasionally, the container appears to stop sorting. This is an infrequent error, apparent in only some simulations after a long time, and I'm short on ideas. (I'm an amateur programmer--suggestions of all kinds are welcome.) My container is a std::multiset that holds Event structs: typedef std::multiset< Event, std::less< Event > > EventPQ; with the Event structs sorted by their double time members: struct Event { public: explicit Event(double t) : time(t), eventID(), hostID(), s() {} Event(double t, int eid, int hid, int stype) : time(t), eventID( eid ), hostID( hid ), s(stype) {} bool operator < ( const Event & rhs ) const { return ( time < rhs.time ); } double time; ... }; The program iterates through periods of adding events with unordered times to EventPQ currentEvents and then pulling off events in order. Rarely, after some events have been added (with perfectly 'legal' times), events start getting executed out of order. What could make the events ever not get ordered properly? (Or what could mess up the iterator?) I have checked that all the added event times are legitimate (i.e., all exceed the current simulation time), and I have also confirmed that the error does not occur because two events happen to get scheduled for the same time. I'd love suggestions on how to work through this. The code for executing and adding events is below for the curious: double t = 0.0; double nextTimeStep = t + EPID_DELTA_T; EventPQ::iterator eventIter = currentEvents.begin(); while ( t < EPID_SIM_LENGTH ) { // Add some events to currentEvents while ( ( *eventIter ).time < nextTimeStep ) { Event thisEvent = *eventIter; t = thisEvent.time; executeEvent( thisEvent ); eventCtr++; currentEvents.erase( eventIter ); eventIter = currentEvents.begin(); } t = nextTimeStep; nextTimeStep += EPID_DELTA_T; } void Simulation::addEvent( double et, int eid, int hid, int s ) { assert( currentEvents.find( Event(et) ) == currentEvents.end() ); Event thisEvent( et, eid, hid, s ); currentEvents.insert( thisEvent ); }

    Read the article

  • Java Flow Control Problem

    - by Kyle_Solo
    I am programming a simple 2d game engine. I've decided how I'd like the engine to function: it will be composed of objects containing "events" that my main game loop will trigger when appropriate. A little more about the structure: Every GameObject has an updateEvent method. objectList is a list of all the objects that will receive update events. Only objects on this list have their updateEvent method called by the game loop. I’m trying to implement this method in the GameObject class (This specification is what I’d like the method to achieve): /** * This method removes a GameObject from objectList. The GameObject * should immediately stop executing code, that is, absolutely no more * code inside update events will be executed for the removed game object. * If necessary, control should transfer to the game loop. * @param go The GameObject to be removed */ public void remove(GameObject go) So if an object tries to remove itself inside of an update event, control should transfer back to the game engine: public void updateEvent() { //object's update event remove(this); System.out.println("Should never reach here!"); } Here’s what I have so far. It works, but the more I read about using exceptions for flow control the less I like it, so I want to see if there are alternatives. Remove Method public void remove(GameObject go) { //add to removedList //flag as removed //throw an exception if removing self from inside an updateEvent } Game Loop for(GameObject go : objectList) { try { if (!go.removed) { go.updateEvent(); } else { //object is scheduled to be removed, do nothing } } catch(ObjectRemovedException e) { //control has been transferred back to the game loop //no need to do anything here } } // now remove the objects that are in removedList from objectList 2 questions: Am I correct in assuming that the only way to implement the stop-right-away part of the remove method as described above is by throwing a custom exception and catching it in the game loop? (I know, using exceptions for flow control is like goto, which is bad. I just can’t think of another way to do what I want!) For the removal from the list itself, it is possible for one object to remove one that is farther down on the list. Currently I’m checking a removed flag before executing any code, and at the end of each pass removing the objects to avoid concurrent modification. Is there a better, preferably instant/non-polling way to do this?

    Read the article

  • How do I use JDK 7 on Mac OSX?

    - by Yko
    OK. This is a newbie question but I can't figure it out... I would like to use the WatchService API as mentioned in this link: http://download.oracle.com/javase/tutorial/essential/io/notification.html After reading around, I found out that WatchService is part of the NIO class which is scheduled for JDK 7. So, it is in beta form. It's fine. http://jdk7.java.net/download.html has the JDK which I downloaded and extracted. I got a bunch of folders. I don't know what to do with them. Then, I read around some more and found that some nice group of people created JDK 7 as a binary so someone like me can install it easily. It is called Open JDK: http://code.google.com/p/openjdk-osx-build/ So, I downloaded the .dmg file and instal it. Then I open "Java Preference" and see that OpenJDK7 is available. So, now I feel that I can start trying out WatchService API. From the tutorial in the first link, the author gave a .java file to test it out first and make sure that it is running. Here is the link to the file: http://download.oracle.com/javase/tutorial/essential/io/examples/WatchDir.java So, I boot up Eclipse (actually I use STS) and create a new Java project and choose JaveSE-1.7 in the "use an execution environment JRE:". Under the src folder, I copy pasted the WatchDir.java file. And I still see tons of squiggly red lines. All the "import.java.nio.*" are all red and I cannot run it as a Java app. If you read this far, thanks a lot. So, now... What do I need to do? Thanks. EDIT: I actually did not pursue using Java 7 but there are a lot of interest in it and it seems like people keep answering this question. What should I do to make it more relevant to people who search for it? Let me know by PMing me. Thanks.

    Read the article

  • Improving performance for WRITE operation on Oracle DB in Java

    - by Lucky
    I've a typical scenario & need to understand best possible way to handle this, so here it goes - I'm developing a solution that will retrieve data from a remote SOAP based web service & will then push this data to an Oracle database on network. Also, this will be a scheduled task that will execute every 15 minutes. I've event queues on remote service that contains the INSERT/UPDATE/DELETE operations that have been done since last retrieval, & once I retrieve the events for last 15 minutes, it again add events for next retrieval. Now, its just pushing data to Oracle so all my interactions are INSERT & UPDATE statements. There are around 60 tables on Oracle with some of them having 100+ columns. Moreover, for every 15 minutes cycle there would be around 60-70 Inserts, 100+ Updates & 10-20 Deletes. This will be an executable jar file that will terminate after operation & will again start on next 15 minutes cycle. So, I need to understand how should I handle WRITE operations (best practices) to improve performance for this application as whole ? Current Test Code (on every cycle) - Connects to remote service to get events. Creates a connection with DB (single connection object). Identifies the type of operation (INSERT/UPDATE/DELETE) & table on which it is done. After above, calls the respective method based on type of operation & table. Uses Preparedstatement with positional parameters, & retrieves each column value from remote service & assigns that to statement parameters. Commits the statement & returns to get event class to process next event. Above is repeated till all the retrieved events are processed after which program closes & then starts on next cycle & everything repeats again. Thanks for help !

    Read the article

< Previous Page | 102 103 104 105 106 107 108 109 110 111 112 113  | Next Page >