Search Results

Search found 11010 results on 441 pages for 'txt record'.

Page 81/441 | < Previous Page | 77 78 79 80 81 82 83 84 85 86 87 88  | Next Page >

  • Cannot write the output to a text file in cpp program

    - by swapedoc
    I have this input file "https://code.google.com/codejam/contest/351101/dashboard/do/A-large-practice.in?cmd=GetInputFile&problem=374101&input_id=1&filename=A-large-practice.in&redownload_last=1&agent=website&csrfmiddlewaretoken=OWMxNTVmMTUyODBiYjhhN2Q2OTM3ZGJiMTNhNDkwMDF8fDEzNzIxNzI1NTE3ODAzMjA%3D" I tried to read this file :-using freopen("filename.txt",r,stdin); and then I wanted the output written to be written to another text file which I can upload in this codejam practice question for the judge. #include<iostream> #include<cstdio> using namespace std; int main() { int t,k=0,a[2000]; freopen("ab.txt","r",stdin); scanf("%d",&t); while(t--) { freopen("cb.txt","w",stdout); int c; scanf("%d",&c); int n; scanf("%d",&n); for(int i=0;i<n;i++) scanf("%d",&a[i]); printf("Case #%d: ",++k); for(int i=0;i<n-1;i++) {for(int j=i+1;j<n;j++) if((a[i]+a[j])==c) {printf("%d %d\n",i+1,j+1); i=n;} } } return 0; } This is my code. Now the problem is the output file cb.txt contains only the last line of the input. I want the the whole of the output to be written to cb.txt,so what should I do.

    Read the article

  • Diffrernce between BackgroundWorker.ReportProgress() and Control.Invoke()

    - by ohadsc
    What is the difference between options 1 and 2 in the following? private void BGW_DoWork(object sender, DoWorkEventArgs e) { for (int i=1; i<=100; i++) { string txt = i.ToString(); if (Test_Check.Checked) //OPTION 1 Test_BackgroundWorker.ReportProgress(i, txt); else //OPTION 2 this.Invoke((Action<int, string>)UpdateGUI, new object[] {i, txt}); } } private void BGW_ProgressChanged(object sender, ProgressChangedEventArgs e) { UpdateGUI(e.ProgressPercentage, (string)e.UserState); } private void UpdateGUI(int percent, string txt) { Test_ProgressBar.Value = percent; Test_RichTextBox.AppendText(txt + Environment.NewLine); } Looking at reflector, the Control.Invoke() appears to use: this.FindMarshalingControl().MarshaledInvoke(this, method, args, 1); whereas BackgroundWorker.Invoke() appears to use: this.asyncOperation.Post(this.progressReporter, args); (I'm just guessing these are the relevant function calls.) If I understand correctly, BGW Posts to the WinForms window its progress report request, whereas Control.Invoke uses a CLR mechanism to invoke on the right thread. Am I close? And if so, what are the repercussions of using either ? Thanks

    Read the article

  • How get file path from std::ifstream in c++

    - by Nilesh Shinge
    Hi All I am opening file using C++ function std::ifstream. I may open file using relative path (file.txt) or absoulte path (C:\test\file.txt). As I am passing string as file name, dont know whether it is relative or absolute path. Can any body tell me how to get absolutre path after file has been successfully open using std::ifstream function. e.g. std::ifstream file(strFile); // strFile = "file.txt" or strFile = "C:\test\file.txt" I want absolute path after file open successfully. Thanks Nilesh

    Read the article

  • The conceptual process of populating related tables in a database (MySql) from a CSV file

    - by user322772
    I'm new to relational databases and all the material I've read covered primary and foreign keys, normal forms, and joins but left out to populate the database once it's created. How do you import a CSV file so the fields match their related table? Say you were tying to build a beer database and had a CSV file with each line as a record. Header: brewer, beer_name, country, city, state, beer_category, beer_type, alcohol_content Record 1: Anheuser-Busch, Budweiser, United States, St. Louis, Mo, Pale lager, Regular, 5.0% Record 2: Anheuser-Busch, Bud Light, United States, St. Louis, Mo, Pale lager Light, 4.2% Record 3: Miller Brewing Company, Miller Lite, United States, Milwaukee, WI, Pale lager, Light, 4.2% You can create a "Brewer" table and a "Beer" table. When importing how does you connect the primary keys between the tables?

    Read the article

  • Node.js Creating and Deleting a File Recursively

    - by Matt
    I thought it would be a cool experiment to have a for loop and create a file hello.txt and then delete it with unlink. I figured that if fs.unlink is the delete file procedure in Node, then fs.link must be the create file. However, my code will only delete, and it will not create, not even once. Even if I separate the fs.link code into a separate file, it still will not create my file hello.txt. Below is my code: var fs = require('fs'), for(var i=1;i<=10;i++){ fs.unlink('./hello.txt', function (err) { if (err){ throw err; } else { console.log('successfully deleted file'); } fs.link('./hello.txt', function (err) { if (err){ throw err; } else { console.log('successfully created file'); } }); }); } http://nodejs.org/api/fs.html#fs_fs_link_srcpath_dstpath_callback Thank you!

    Read the article

  • Should I be using callbacks or should I override attributes?

    - by ryeguy
    What is the more "rails-like"? If I want to modify a model's property when it's set, should I do this: def url=(url) #remove session id self[:url] = url.split('?s=')[0] end or this? before_save do |record| #remove session id record.url = record.url.split('?s=')[0] end Is there any benefit for doing it one way or the other? If so, why? If not, which one is generally more common?

    Read the article

  • system() copy fails, while cmd copy works

    - by Mike
    In cmd.exe, I can execute the command "copy c:\hello.txt c:\hello2.txt" and it worked fine. But in my C program, I ran this piece of code and got the following error: #include <iostream> using namespace std; int main() { system("copy c:\hello.txt c:\hello2.txt"); system("pause"); return 0; } Output: The system cannot find the file specified. Anybody know what is going on here?

    Read the article

  • C#: My callback function gets called twice for every Sent Request

    - by Madi D.
    I've Got a program that uploads/downloads files into an online server,Has a callback to report progress and log it into a textfile, The program is built with the following structure: public void Upload(string source, string destination) { //Object containing Source and destination to pass to the threaded function KeyValuePair<string, string> file = new KeyValuePair<string, string>(source, destination); //Threading to make sure no blocking happens after calling upload Function Thread t = new Thread(new ParameterizedThreadStart(amazonHandler.TUpload)); t.Start(file); } private void TUpload(object fileInfo) { KeyValuePair<string, string> file = (KeyValuePair<string, string>)fileInfo; /* Some Magic goes here,Checking The file and Authorizing Upload */ var ftiObject = new FtiObject () { FileNameOnHDD = file.Key, DestinationPath = file.Value, //Has more data used for calculations. }; //Threading to make sure progress gets callback gets called. Thread t = new Thread(new ParameterizedThreadStart(amazonHandler.UploadOP)); t.Start(ftiObject); //Signal used to stop progress untill uploadCompleted is called. uploadChunkDoneSignal.WaitOne(); /* Some Extra Code */ } private void UploadOP(object ftiSentObject) { FtiObject ftiObject = (FtiObject)ftiSentObject; /* Some useless code to create the uri and prepare the ftiObject. */ // webClient.UploadFileAsync will open a thread that // will upload the file and report // progress/complete using registered callback functions. webClient.UploadFileAsync(uri, "PUT", ftiObject.FileNameOnHDD, ftiObject); } I got a callback that is registered to the Webclient's UploadProgressChanged event , however it is getting called twice per sent request. void UploadProgressCallback(object sender, UploadProgressChangedEventArgs e) { FtiObject ftiObject = (FtiObject )e.UserState; Logger.log(ftiObject.FileNameOnHDD, (double)e.BytesSent ,e.TotalBytesToSend); } Log Output: Filename: C:\Text1.txt Uploaded:1024 TotalFileSize: 665241 Filename: C:\Text1.txt Uploaded:1024 TotalFileSize: 665241 Filename: C:\Text1.txt Uploaded:2048 TotalFileSize: 665241 Filename: C:\Text1.txt Uploaded:2048 TotalFileSize: 665241 Filename: C:\Text1.txt Uploaded:3072 TotalFileSize: 665241 Filename: C:\Text1.txt Uploaded:3072 TotalFileSize: 665241 Etc... I am watching the Network Traffic using a watcher, and only 1 request is being sent. Some how i cant Figure out why the callback is being called twice, my doubt was that the callback is getting fired by each thread opened(the main Upload , and TUpload), however i dont know how to test if thats the cause. Note: The reason behind the many /**/ Comments is to indicate that the functions do more than just opening threads, and threading is being used to make sure no blocking occurs (there a couple of "Signal.WaitOne()" around the code for synchronization)

    Read the article

  • array of structures, or structure of arrays?

    - by Jason S
    Hmmm. I have a table which is an array of structures I need to store in Java. The naive don't-worry-about-memory approach says do this: public class Record { final private int field1; final private int field2; final private long field3; /* constructor & accessors here */ } List<Record> records = new ArrayList<Record>(); If I end up using a large number ( 106 ) of records, where individual records are accessed occasionally, one at a time, how would I figure out how the preceding approach (an ArrayList) would compare with an optimized approach for storage costs: public class OptimizedRecordStore { final private int[] field1; final private int[] field2; final private long[] field3; Record getRecord(int i) { return new Record(field1[i],field2[i],field3[i]); } /* constructor and other accessors & methods */ } edit: assume the # of records is something that is changed infrequently or never I'm probably not going to use the OptimizedRecordStore approach, but I want to understand the storage cost issue so I can make that decision with confidence. obviously if I add/change the # of records in the OptimizedRecordStore approach above, I either have to replace the whole object with a new one, or remove the "final" keyword. kd304 brings up a good point that was in the back of my mind. In other situations similar to this, I need column access on the records, e.g. if field1 and field2 are "time" and "position", and it's important for me to get those values as an array for use with MATLAB, so I can graph/analyze them efficiently.

    Read the article

  • Tail recursion and memoization with C#

    - by Jay
    I'm writing a function that finds the full path of a directory based on a database table of entries. Each record contains a key, the directory's name, and the key of the parent directory (it's the Directory table in an MSI if you're familiar). I had an iterative solution, but it started looking a little nasty. I thought I could write an elegant tail recursive solution, but I'm not sure anymore. I'll show you my code and then explain the issues I'm facing. Dictionary<string, string> m_directoryKeyToFullPathDictionary = new Dictionary<string, string>(); ... private string ExpandDirectoryKey(Database database, string directoryKey) { // check for terminating condition string fullPath; if (m_directoryKeyToFullPathDictionary.TryGetValue(directoryKey, out fullPath)) { return fullPath; } // inductive step Record record = ExecuteQuery(database, "SELECT DefaultDir, Directory_Parent FROM Directory where Directory.Directory='{0}'", directoryKey); // null check string directoryName = record.GetString("DefaultDir"); string parentDirectoryKey = record.GetString("Directory_Parent"); return Path.Combine(ExpandDirectoryKey(database, parentDirectoryKey), directoryName); } This is how the code looked when I realized I had a problem (with some minor validation/massaging removed). I want to use memoization to short circuit whenever possible, but that requires me to make a function call to the dictionary to store the output of the recursive ExpandDirectoryKey call. I realize that I also have a Path.Combine call there, but I think that can be circumvented with a ... + Path.DirectorySeparatorChar + .... I thought about using a helper method that would memoize the directory and return the value so that I could call it like this at the end of the function above: return MemoizeHelper( m_directoryKeyToFullPathDictionary, Path.Combine(ExpandDirectoryKey(database, parentDirectoryKey)), directoryName); But I feel like that's cheating and not going to be optimized as tail recursion. Any ideas? Should I be using a completely different strategy? This doesn't need to be a super efficient algorithm at all, I'm just really curious. I'm using .NET 4.0, btw. Thanks!

    Read the article

  • running Echo from Java

    - by ripper234
    I'm trying out the Runtime.exec() method to run a command line process. I wrote this sample code, which runs without problems but doesn't produce a file at c:\tmp.txt. String cmdLine = "echo foo > c:\\tmp.txt"; Runtime rt = Runtime.getRuntime(); Process pr = rt.exec(cmdLine); BufferedReader input = new BufferedReader( new InputStreamReader(pr.getInputStream())); String line; StringBuilder output = new StringBuilder(); while ((line = input.readLine()) != null) { output.append(line); } int exitVal = pr.waitFor(); logger.info(String.format("Ran command '%s', got exit code %d, output:\n%s", cmdLine, exitVal, output)); The output is INFO 21-04 20:02:03,024 - Ran command 'echo foo c:\tmp.txt', got exit code 0, output: foo c:\tmp.txt

    Read the article

  • Capture video from WPF app

    - by Julien Couvreur
    I want to write a C# application which can record a video capture of one of its WPF controls. Is there a solution in .Net to record video from a control, or is there some library I could use? My goal is to write a SketchCast application. The use case is the following: launch SketchCast app and press record button, write ink into a WPF ink area, and talk, press stop, recorded voice and ink animation get saved into a video file in some encoding.

    Read the article

  • Automatically copy files without overwriting, but creating numbered ones instead

    - by user1688322
    I need to copy files at a regular interval, eg once an hour so I tried setting up an xcopy batch saying it should copy the files it needs to copy to another folder. Now when it copies, it overwrites the files which is not what it is supposed to do. When a file is copied, it should create a new file instead, named something like File.txt, File-COPY1.txt, File-COPY2.txt or something like that. Any way to do that? Thanks in advance.

    Read the article

  • UNIX script to parse Zone file (is this the best code?)

    - by Steve
    Hi, FOund the following on: http://mike.murraynet.net/2009/08/23/parsing-the-verisign-zone-file-with-os-x/ Can unix-masters have a look at it and see if its the best possible way to gather the unique domainsnames in a zone file? For .NET domains: grep “^[a-zA-Z0-9-]+ NS .” net.zone|sed “s/NS .//”|uniq netdomains.txt For .COM domains: grep “^[a-zA-Z0-9-]+ NS .” com.zone|sed “s/NS .//”|uniq comdomains.txt For .EDU domains: grep “^[a-zA-Z0-9-]+ NS .” edu.zone|sed “s/NS .//”|uniq edudomains.txt

    Read the article

  • Yank file name / path of current buffer in Vim

    - by Dave Tapley
    Assuming the current buffer is a file open for edit, so :e does not display E32: No file name. I would like to yank one or all of: The file name exactly as show on the status line, e.g. ~\myfile.txt A full path to the file, e.g. c:\foo\bar\myfile.txt Just the file name, e.g. myfile.txt

    Read the article

  • Function for putting all database table to an array

    - by jasmine
    I have written a function to print database table to an array like this $db_array= Array( ID=>1, PARENTID =>1, TITLE => LIPSUM, TEXT =>LIPSUM ) My function is: function dbToArray($table) { $allArrays =array(); $query = mysql_query("SELECT * FROM $table"); $dbRow = mysql_fetch_array($query); for ($i=0; $i<count($dbRow) ; $i++) { $allArrays[$i] = $dbRow; } $txt .='<pre>'; $txt .= print_r($allArrays); $txt .= '</pre>'; return $txt; } Anything wrong in my function. Any help is appreciated about my problem. Thanks in advance

    Read the article

  • System.Threading.Timer Doesn't Trigger my TimerCallBack Delegate

    - by Tom Kong
    Hi, I am writing my first Windows Service using C# and I am having some trouble with my Timer class. When the service is started, it runs as expected but the code will not execute again (I want it to run every minute) Please take a quick look at the attached source and let me know if you see any obvious mistakes! TIA using System; using System.Collections.Generic; using System.ComponentModel; using System.Data; using System.Diagnostics; using System.Linq; using System.ServiceProcess; using System.Text; using System.Threading; using System.IO; namespace CXO001 { public partial class Service1 : ServiceBase { public Service1() { InitializeComponent(); } /* * Aim: To calculate and update the Occupancy values for the different Sites * * Method: Retrieve data every minute, updating a public value which can be polled */ protected override void OnStart(string[] args) { Daemon(); } public void Daemon() { TimerCallback tcb = new TimerCallback(On_Tick); TimeSpan duetime = new TimeSpan(0, 0, 1); TimeSpan interval = new TimeSpan(0, 1, 0); Timer querytimer = new Timer(tcb, null, duetime, interval); } protected override void OnStop() { } static int[] floorplanids = new int[] { 115, 114, 107, 108 }; public static List<Record> Records = new List<Record>(); static bool firstrun = true; public static void On_Tick(object timercallback) { //Update occupancy data for the last minute //Save a copy of the public values to HDD with a timestamp string starttime; if (Records.Count > 0) { starttime = Records.Last().TS; firstrun = false; } else { starttime = DateTime.Today.AddHours(7).ToString(); firstrun = true; } DateTime endtime = DateTime.Now; GetData(starttime, endtime); } public static void GetData(string starttime, DateTime endtime) { string connstr = "Data Source = 192.168.1.123; Initial Catalog = Brickstream_OPS; User Id = Brickstream; Password = bstas;"; DataSet resultds = new DataSet(); //Get the occupancy for each Zone foreach (int zone in floorplanids) { SQL s = new SQL(); string querystr = "SELECT SUM(DIRECTIONAL_METRIC.NUM_TO_ENTER - DIRECTIONAL_METRIC.NUM_TO_EXIT) AS 'Occupancy' FROM REPORT_OBJECT INNER JOIN REPORT_OBJ_METRIC ON REPORT_OBJECT.REPORT_OBJ_ID = REPORT_OBJ_METRIC.REPORT_OBJECT_ID INNER JOIN DIRECTIONAL_METRIC ON REPORT_OBJ_METRIC.REP_OBJ_METRIC_ID = DIRECTIONAL_METRIC.REP_OBJ_METRIC_ID WHERE (REPORT_OBJ_METRIC.M_START_TIME BETWEEN '" + starttime + "' AND '" + endtime.ToString() + "') AND (REPORT_OBJECT.FLOORPLAN_ID = '" + zone + "');"; resultds = s.Go(querystr, connstr, zone.ToString(), resultds); } List<Record> result = new List<Record>(); int c = 0; foreach (DataTable dt in resultds.Tables) { Record r = new Record(); r.TS = DateTime.Now.ToString(); r.Zone = dt.TableName; if (!firstrun) { r.Occupancy = (dt.Rows[0].Field<int>("Occupancy")) + (Records[c].Occupancy); } else { r.Occupancy = dt.Rows[0].Field<int>("Occupancy"); } result.Add(r); c++; } Records = result; MrWriter(); } public static void MrWriter() { StringBuilder output = new StringBuilder("Time,Zone,Occupancy\n"); foreach (Record r in Records) { output.Append(r.TS); output.Append(","); output.Append(r.Zone); output.Append(","); output.Append(r.Occupancy.ToString()); output.Append("\n"); } output.Append(firstrun.ToString()); output.Append(DateTime.Now.ToFileTime()); string filePath = @"C:\temp\CXO.csv"; File.WriteAllText(filePath, output.ToString()); } } }

    Read the article

  • JQuery: how to store records id for data from ajax query in dynamicly created html elements

    - by grapkulec
    Probably question title is rather cryptic but I will try to explain myself here so please bare with me :) Let's assume this configuration: server-side is PHP application responding for requests with data (list of items, single item details, etc.) in json format client-side is JQuery application sending ajax request to that PHP app and creating html content corresponding with received data So, for example: client requests "list of all animals with names staring with 'A'", gets the json response from server, and for every "animal" creates some html gizmo like div with animal description or something like that. It doesn't really matter what html element it will be but it has to point exactly to specific record by "containing" id of that record. And here is my dilemma: is it good solution to use "id" property for that? So it would be like: <div id="10" class="animal"> <p> This is animal of very mysterious kind... </p> </div> <div id="11" class="animal"> <p> And this one is very common to our country... </p> </div> where id="10" is of course indication that this is representation of record with id = 10. Or maybe I should store this record id in some custom made tag like <record_id>10</record_id> and leave an "id" strictly for what it was meant to be (css selector)? I need that record id for further stuff like updating database with some user input or deleting some of "animals" or creating new ones or anything that will be needed. All manipulations will be done with JQuery and ajax requests and responses will be visualized also with dynamic creation of html interface. I'm sure that somebody had to deal with that kind of stuff before so I would be grateful for some tips on that topic.

    Read the article

  • Moving and renaming files, keeping extension but include sub directories in batch file

    - by Ser1esII
    Forgive me if this is nor the place to ask these questions, I am new to batch and scripts and a bit new to these kind of posts... I have a folder that will receive files and folders, I want to run a script that looks at the directory and renames all files in each subfolder numerically, and moves them if possible. For example I have something that looks like the following Recieved_File_Folder |_folder1 | |_file1.txt | |_file2.bmp |_folder2 | |_file4.exe | |_file5.bmp |__file9.txt |__file10.jpg I would like to be able to look in every directory and move it to something like this, keeping in mind the names of the files will be random and I want to keep the extension intact also. Renamed_Folder |_folder1 | |_1.txt | |_2.bmp |_folder2 | |_1.exe | |_2.bmp |__1.txt |__2.jpg I have spent alot of time on this and am not doing too well with it, any help would be very greatly appreciated!! Thank you in advance!

    Read the article

  • ssh & script problem

    - by Nishanth
    I am having a strange problem while doing ssh. I am not sure where the term Unmatched ` is coming from. What I need to do is run script that logs information of what I am doing on the terminal to text file. After ssh - Sun Microsystems Inc. SunOS 5.8 Generic Patch October 2001 This is /etc/motd, last updated 3 Feb 2003. To learn about the UCS system and other aspects of computing at UL-Lafayette visit our home page http://helpdesk.louisiana.edu/ . For more information about system use, contact the Help Desk, Stephens Hall, Room 201, 482-5516 (x25516), during normal UL office hours; or send e-mail to [email protected]. ATTENTION: Unsecure Telnet and FTP will be turned off soon. Please make arrange to use ssh or sftp. Putty(telnet) and WinSCP(ftp) would be a good replacement. Unmatched ` d13.ucs.louisiana.edu% bash bash-2.04$ script -a myInformation.txt Script started, file is myInformation.txt Unmatched ` d13.ucs.louisiana.edu% When I tried to start the script with name myInformation.txt, you can see the message I am getting - Script started, file is myInformation.txt. But again I am getting that message Unmatched ` and is coming out of bash, as you can notice. What is the problem ? Any insights suggested would be very great. Note: file with name myInformation.txt is being created but nothing goes in to it. As I have even tried running certain commands like ls and then exited the script with ctrl+d. But when I open the file, nothing is there.

    Read the article

  • how to fetch exact string from a text file

    - by user136104
    Hi All, here is me requirement. I have one text file and I need to get exact specified string from that file C:>type small.txt | find "small2" small2 small22 small20 C:>type small.txt small2 small22 small20 C:>type small.txt | find "small2" small2 small22 small20 C:\ Here it is giving all the words instead of only "small2" Plz help me in this.. it's urgent Thanks in advance

    Read the article

  • jsf, richfaces, popup window

    - by Hubidubi
    Hi I would like to make a list-detail view with richfaces. There will be a link for every record in the list that should open a new window containing record details. I tried to implement the link this way: <a4j:commandLink oncomplete="window.open('/pages/serviceDetail.jsf','popupWindow', 'dependent=yes, menubar=no, toolbar=no, height=500, width=400')" actionListener="#{monitoringBean.recordDetail}" value="details" /> I use <a4j:keepAlive beanName="monitoringBean" ajaxOnly="false" /> for both the list and the detail page. recordDetail method fills the data of the selected record to a variable of the bean that I would like to display on the detail page. The problem is that keepalive doesn't work, so I get new bean instance on the detail page every time. So the the previously selected record from the other bean is not accessible here. Is there a way to pass parameter (id) to the detail page to handle record selection. Or is there any way to make keepalive work? (I this this would be the easiest). Thanks

    Read the article

  • Optimizing python code performance when importing zipped csv to a mongo collection

    - by mark
    I need to import a zipped csv into a mongo collection, but there is a catch - every record contains a timestamp in Pacific Time, which must be converted to the local time corresponding to the (longitude,latitude) pair found in the same record. The code looks like so: def read_csv_zip(path, timezones): with ZipFile(path) as z, z.open(z.namelist()[0]) as input: csv_rows = csv.reader(input) header = csv_rows.next() check,converters = get_aux_stuff(header) for csv_row in csv_rows: if check(csv_row): row = { converter[0]:converter[1](value) for converter, value in zip(converters, csv_row) if allow_field(converter) } ts = row['ts'] lng, lat = row['loc'] found_tz_entry = timezones.find_one(SON({'loc': {'$within': {'$box': [[lng-tz_lookup_radius, lat-tz_lookup_radius],[lng+tz_lookup_radius, lat+tz_lookup_radius]]}}})) if found_tz_entry: tz_name = found_tz_entry['tz'] local_ts = ts.astimezone(timezone(tz_name)).replace(tzinfo=None) row['tz'] = tz_name else: local_ts = (ts.astimezone(utc) + timedelta(hours = int(lng/15))).replace(tzinfo = None) row['local_ts'] = local_ts yield row def insert_documents(collection, source, batch_size): while True: items = list(itertools.islice(source, batch_size)) if len(items) == 0: break; try: collection.insert(items) except: for item in items: try: collection.insert(item) except Exception as exc: print("Failed to insert record {0} - {1}".format(item['_id'], exc)) def main(zip_path): with Connection() as connection: data = connection.mydb.data timezones = connection.timezones.data insert_documents(data, read_csv_zip(zip_path, timezones), 1000) The code proceeds as follows: Every record read from the csv is checked and converted to a dictionary, where some fields may be skipped, some titles be renamed (from those appearing in the csv header), some values may be converted (to datetime, to integers, to floats. etc ...) For each record read from the csv, a lookup is made into the timezones collection to map the record location to the respective time zone. If the mapping is successful - that timezone is used to convert the record timestamp (pacific time) to the respective local timestamp. If no mapping is found - a rough approximation is calculated. The timezones collection is appropriately indexed, of course - calling explain() confirms it. The process is slow. Naturally, having to query the timezones collection for every record kills the performance. I am looking for advises on how to improve it. Thanks. EDIT The timezones collection contains 8176040 records, each containing four values: > db.data.findOne() { "_id" : 3038814, "loc" : [ 1.48333, 42.5 ], "tz" : "Europe/Andorra" } EDIT2 OK, I have compiled a release build of http://toblerity.github.com/rtree/ and configured the rtree package. Then I have created an rtree dat/idx pair of files corresponding to my timezones collection. So, instead of calling collection.find_one I call index.intersection. Surprisingly, not only there is no improvement, but it works even more slowly now! May be rtree could be fine tuned to load the entire dat/idx pair into RAM (704M), but I do not know how to do it. Until then, it is not an alternative. In general, I think the solution should involve parallelization of the task. EDIT3 Profile output when using collection.find_one: >>> p.sort_stats('cumulative').print_stats(10) Tue Apr 10 14:28:39 2012 ImportDataIntoMongo.profile 64549590 function calls (64549180 primitive calls) in 1231.257 seconds Ordered by: cumulative time List reduced from 730 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.012 0.012 1231.257 1231.257 ImportDataIntoMongo.py:1(<module>) 1 0.001 0.001 1230.959 1230.959 ImportDataIntoMongo.py:187(main) 1 853.558 853.558 853.558 853.558 {raw_input} 1 0.598 0.598 370.510 370.510 ImportDataIntoMongo.py:165(insert_documents) 343407 9.965 0.000 359.034 0.001 ImportDataIntoMongo.py:137(read_csv_zip) 343408 2.927 0.000 287.035 0.001 c:\python27\lib\site-packages\pymongo\collection.py:489(find_one) 343408 1.842 0.000 274.803 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:699(next) 343408 2.542 0.000 271.212 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:644(_refresh) 343408 4.512 0.000 253.673 0.001 c:\python27\lib\site-packages\pymongo\cursor.py:605(__send_message) 343408 0.971 0.000 242.078 0.001 c:\python27\lib\site-packages\pymongo\connection.py:871(_send_message_with_response) Profile output when using index.intersection: >>> p.sort_stats('cumulative').print_stats(10) Wed Apr 11 16:21:31 2012 ImportDataIntoMongo.profile 41542960 function calls (41542536 primitive calls) in 2889.164 seconds Ordered by: cumulative time List reduced from 778 to 10 due to restriction <10> ncalls tottime percall cumtime percall filename:lineno(function) 1 0.028 0.028 2889.164 2889.164 ImportDataIntoMongo.py:1(<module>) 1 0.017 0.017 2888.679 2888.679 ImportDataIntoMongo.py:202(main) 1 2365.526 2365.526 2365.526 2365.526 {raw_input} 1 0.766 0.766 502.817 502.817 ImportDataIntoMongo.py:180(insert_documents) 343407 9.147 0.000 491.433 0.001 ImportDataIntoMongo.py:152(read_csv_zip) 343406 0.571 0.000 391.394 0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:384(intersection) 343406 379.957 0.001 390.824 0.001 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:435(_intersection_obj) 686513 22.616 0.000 38.705 0.000 c:\python27\lib\site-packages\rtree-0.7.0-py2.7.egg\rtree\index.py:451(_get_objects) 343406 6.134 0.000 33.326 0.000 ImportDataIntoMongo.py:162(<dictcomp>) 346 0.396 0.001 30.665 0.089 c:\python27\lib\site-packages\pymongo\collection.py:240(insert) EDIT4 I have parallelized the code, but the results are still not very encouraging. I am convinced it could be done better. See my own answer to this question for details.

    Read the article

< Previous Page | 77 78 79 80 81 82 83 84 85 86 87 88  | Next Page >