Search Results

Search found 37883 results on 1516 pages for 'sparse files'.

  • Efficient file buffering & scanning methods for large files in python

    - by eblume
    The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it:

    What is the fastest (least execution time) way to split a text file into ALL (overlapping) substrings of size N (bounded N, e.g. 36) while throwing out newline characters?

    I am writing a module which parses files in the FASTA ASCII-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module.

    Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, none of the modules I've been able to find does the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) into every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8:

        >>> import cStringIO
        >>> example_file = cStringIO.StringIO("""\
        >header
        CAGTcag
        TFgcACF
        """)
        >>> for read in parse(example_file):
        ...     print read
        ...
        CAGTCAGTF
        AGTCAGTFG
        GTCAGTFGC
        TCAGTFGCA
        CAGTFGCAC
        AGTFGCACF

    The function that had the absolute best performance of the methods I could think of is this:

        def parse(file):
            size = 8  # of course in my code this is a function argument
            file.readline()  # skip past the header
            buffer = ''
            for line in file:
                buffer += line.rstrip().upper()
                # emit one read per position while a full window is buffered
                while len(buffer) >= size:
                    yield buffer[:size]
                    buffer = buffer[1:]

    This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks!

    Note: this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size.

    Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!
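
A minimal sketch of the approach named in the post-answer conclusion (read the whole file, drop the newlines, then slice), written for Python 3 rather than the Python 2 of the post. The function name `overlapping_reads` and the rule that header lines start with `>` are assumptions made for illustration; this is not the author's code.

```python
def overlapping_reads(text, size):
    """Yield every overlapping substring of length `size` from FASTA-style text."""
    # Assumption: header lines start with '>' and are skipped entirely.
    sequence = "".join(
        line.strip() for line in text.splitlines() if not line.startswith(">")
    ).upper()
    # One slice per start position; the last full window starts at len - size.
    for start in range(len(sequence) - size + 1):
        yield sequence[start:start + size]


example = ">header\nCAGTcag\nTFgcACF\n"
reads = list(overlapping_reads(example, 8))
```

Slicing a single joined string avoids re-copying the buffer on every step, at the cost of holding the whole sequence in memory at once.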

  • how to find files in a given branch

    - by Haiyuan Zhang
    I noticed that when doing code reviews, people at my company usually give only the branch in which their work was done, and nothing else. So I guess there must be an easy way to find all the files that have a version in the given branch, which is the same as finding all the files that have been changed. I don't know this expected "easy way" to find the files in a certain branch, so I need your help. Thanks in advance.

  • RealPath returns an empty string

    - by Abs
    Hello all, I have the following, which just loops through the files in a directory and echoes the file names. However, when I use realpath, it returns nothing. What am I doing wrong?

        if ($handle = opendir($font_path)) {
            while (false !== ($file = readdir($handle))) {
                if ($file != "." && $file != ".." && $file != "a.zip") {
                    echo $file.'<br />'; // I can see the file names fine
                    echo realpath($file); // returns an empty string?!
                }
            }
            closedir($handle);
        }

    Thanks all for any help on this. I am on a Windows machine, running PHP 5.3 and Apache 2.2.

  • CVS list of files only in working directories

    - by Joshua Berry
    Is it possible to get a list of files that are in the working directory tree, but not in the current branch/tag? I currently diff the working copy with another directory updated to the same module and tag/branch but without the local non-repo files. It works, but doesn't honor the .cvsignore files. I figure there must be an option using a variation of 'cvs diff'. Thanks in advance.

  • Using windows CopyFile function to copy all files with certain name format

    - by Ben313
    Hello! I am updating some C code that copies files with a certain name. Basically, I have a directory with a bunch of files named like so:

        AAAAA.1.XYZ
        AAAAA.2.ZYX
        AAAAA.3.YZX
        BBBBB.1.XYZ
        BBBBB.2.ZYX

    In the old code, they just used a call to ShellExecute and ran xcopy.exe. To get all the files starting with AAAAA, they gave xcopy the file name AAAAA.* and it knew to copy all of the files starting with AAAAA. Now I'm trying to get it to copy without having to use the command line, and I am running into trouble. I was hoping CopyFile would be smart enough to handle AAAAA.* as the file to be copied, but it doesn't do what xcopy did at all. So, any ideas on how to do this without the external call to xcopy.exe?
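
The general shape of a wildcard copy without an external tool is to expand the pattern yourself and copy each match one at a time. The sketch below shows that idea in Python rather than the C/Win32 code the question is about; `copy_matching` is a hypothetical helper name, and the temporary directories exist only to make the example self-contained.

```python
import fnmatch
import os
import shutil
import tempfile


def copy_matching(src_dir, pattern, dst_dir):
    """Copy every regular file in src_dir whose name matches pattern into dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        # Expand the wildcard ourselves, then copy one file per match.
        if os.path.isfile(path) and fnmatch.fnmatch(name, pattern):
            shutil.copy2(path, os.path.join(dst_dir, name))
            copied.append(name)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(src)
    for name in ("AAAAA.1.XYZ", "AAAAA.2.ZYX", "AAAAA.3.YZX", "BBBBB.1.XYZ", "BBBBB.2.ZYX"):
        open(os.path.join(src, name), "w").close()
    copied = copy_matching(src, "AAAAA.*", dst)
    in_dst = sorted(os.listdir(dst))
```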

  • Protect Files from Git

    - by Tanner
    I'm using Git with WindRiver to manage a project of mine. The code is being managed; however, the project files (such as .cproject, .project, .wrmakefile, and .wrproject) are not. When I switch branches, Git deletes those files despite them being in .gitignore, thereby removing my ability to compile the code without having to revert commits or keep a backup. So, is there a way to say to Git: ignore these files and don't touch them no matter what?

  • Wrong owner and group for files created under a samba shared directory

    - by agmao
    I am trying to make writing to a shared Samba directory work, and I have run into a very weird problem. The shared directory is now writable from a client machine, but the files created under the Samba share have unexpected owner and group names. I am writing to the shared directory as user mike on the client machine, but the files created always show steve as the user and group instead. Does anybody know why that would happen? Another thing I just noticed is that on the Samba server, the files have samba as their owner and group, which is the account I created for Samba clients. Thanks a lot.

  • How to ignore the .classpath for Eclipse projects using Mercurial?

    - by Feanor
    I'm trying to share a repository between my Mac (laptop) and PC (desktop). There are some external dependencies for the project that are stored in different places on each machine, and noted in the .classpath file in the Eclipse project. When the project changes are shared, the dependencies break. I'm trying to figure out how to keep this from happening. I've tried using .hgignore with the following settings, among others, without success:

        syntax: glob
        *.classpath

    Based on this question, it appears that the .hgignore file will not allow Mercurial to ignore files that are also committed to the repository. Is there another way around this? Other ways to configure the project to make it work?

  • Mysterious xyz.event files appearing

    - by Pekka
    I am getting mysterious .event files - always empty, created by me a few weeks ago - in several local project directories. They are all Subversion checkouts. They are always named after the directory they reside in, so a directory named pagination will contain a pagination.event file. Does anybody know what this is?

    Possibly important information:

    - I am working on a Windows 7 workstation
    - I use NuSphere's PHP IDE (no updates recently)
    - I use TortoiseSVN for version control
    - I set up a Windows 7 backup job recently that ran once, I can't remember when exactly
    - The event files seem to turn up only in repositories
    - There is no external access to those repositories

  • How to programmatically cut/copy/get files to/from Windows clipboard in a system standard compliant

    - by Ivan
    How do I put a cut/copy reference to specific files and/or folders into the Windows clipboard, so that when I open a standard Windows Explorer window, go somewhere and press Ctrl+V, the files are pasted? And if I copy or cut some files/folders in Windows Explorer, how do I get this information (full names and whether they were cut or copied) in my program? I program in C# 4, but approaches in other languages are also interesting to know.

  • Can't access my files in ASP.NET web site

    - by jumbojs
    I'm having a very difficult time. I am running Windows Server 2008, and I have an Able Commerce site using ASP.NET with C#. I'm writing an automated task that will FTP some XML files down into a local directory on our web server; the program then parses the XML files and saves information to our database. The problem: once I save the files to our local directory, my program has no access to them. The NETWORK SERVICE user's permissions aren't being inherited by the XML files, so my program can't do anything with them. I can manually change the permissions, but that wouldn't be automated and so won't work. How can I get this to work? Help please, it's very frustrating.

  • Windows script to create directories of 3,000 files

    - by uhpl1
    We have some email archiving that is dumping all the emails into a directory. For performance reasons on the server, I want to set up an automated task that will run a script once a day and, if there are more than 3,000 (or whatever number) files in the main directory, create a new directory named with the date and move all the main directory's files into it. I'm sure someone has already written something similar, so if anyone could point me at it that would be great. A batch file or PowerShell would both be fine.
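
The procedure described above (count the files, and past a threshold move them all into a directory named after the date) can be sketched as follows. This is an illustration in Python, not the batch or PowerShell script the poster asked for; `archive_if_over`, the `.eml` file names, and the fixed date are assumptions chosen to keep the example reproducible.

```python
import datetime
import os
import shutil
import tempfile


def archive_if_over(main_dir, limit, today=None):
    """Move all files in main_dir into a dated subdirectory once there are more than `limit`."""
    names = [n for n in os.listdir(main_dir)
             if os.path.isfile(os.path.join(main_dir, n))]
    if len(names) <= limit:
        return None  # below the threshold: leave everything in place
    stamp = (today or datetime.date.today()).isoformat()
    target = os.path.join(main_dir, stamp)
    os.makedirs(target, exist_ok=True)
    for name in names:
        shutil.move(os.path.join(main_dir, name), os.path.join(target, name))
    return target


with tempfile.TemporaryDirectory() as main_dir:
    for i in range(5):
        open(os.path.join(main_dir, "mail%d.eml" % i), "w").close()
    # Hypothetical values: a limit of 3 instead of 3,000, and a fixed date.
    target = archive_if_over(main_dir, limit=3, today=datetime.date(2010, 1, 2))
    remaining = sorted(os.listdir(main_dir))
    moved = sorted(os.listdir(target))
```

A daily scheduled task would call the function once with the real directory and limit; only regular files are counted, so earlier dated subdirectories are left alone.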

  • Combine and compress script files in asp.net mvc

    - by victor_foster
    I am working in Visual Studio 2008, IIS7 and using ASP.NET MVC. I would like to know the best way to combine all of my JavaScript files into one file to reduce the number of HTTP requests to the server. I have seen many articles on this subject but I'm not sure which one I should look at first (many of them are over a year old). Here are the things I would like to do:

    - Combine my JavaScript and CSS files
    - Safely compress my JavaScript files when I publish, but keep them uncompressed while I am debugging
    - Cache my CSS and JavaScript files, but allow them to be refreshed with a hard refresh when they are updated, without having to rename them

  • Seemingly random 404's for static files in Pyramid project

    - by seth
    I'm running a Pyramid project with mod_wsgi. Some of the files in my static directory (images, stylesheets, javascript) load fine, but others are coming up as not found. The files that are not working are all web fonts (otf, svg, woff and eot). I tried adding a text file into the static directory where the fonts are to see if I could access it, but it also came back with 404. The same text file also can't be accessed when put in the images folder. From what I'm looking at, it doesn't seem to be a permissions issue. Any ideas?

  • Scalable (half-million files) version control system

    - by hashable
    We use SVN for our source-code revision control and are experimenting with using it for non-source-code files. We are working with a large set (300-500k) of short (1-4 kB) text files that will be updated on a regular basis and need to be version controlled. We tried using SVN in flat-file mode and it is struggling to handle the first commit (500k files checked in), taking about 36 hours. On a daily basis, we need the system to be able to handle 10k modified files per commit transaction in a short time (<5 min).

    My questions:

    - Is SVN the right solution for my purpose? The initial speed seems too slow for practical use.
    - If yes, is there a particular SVN server implementation that is fast? (We are currently using the GNU/Linux default SVN server and command-line client.)
    - If no, what are the best F/OSS or commercial alternatives?

    Thanks

  • EF 4.x generated entity classes (POCO) and Map files

    - by JBeckton
    I have an MVC 4 app that I am working on, using the code-first implementation, except I cheated a bit: I created my database first, then generated my entity classes (POCOs) from the database using the EF Power Tools (reverse engineer). I guess you could say I used the database-first method, but I have no edmx file, just the context class and my entity classes. I have a few projects in the works using MVC and EF with POCOs, but this is the only project where I used the tool to generate my POCOs from the database.

    My question is about the mapping files that get created when I generate my POCOs using the tool. What is the purpose of these Map files? I figured the map files are needed when generating the database from the model, as with the true code-first method. In my case, where I am using a tool to generate my model from the database, do the map files have any influence on how my app uses the entity classes?

  • Compile multiple C files with make

    - by Mohit Deshpande
    (I am running Linux Ubuntu 9.10, so the extension for an executable is executablefile.out) I am just getting into modular programming (programming with multiple files) in C and I want to know how to compile multiple files in a single makefile. For example, what would be the makefile to compile these files: main.c, dbAdapter.c, dbAdapter.h? (By the way, If you haven't figured it out yet, the main function is in main.c) Also could someone post a link to the documentation of a makefile?

  • Makefile option/rule to handle missing/removed source files

    - by b3nj1
    http://stackoverflow.com/questions/239004/need-a-makefile-dependency-rule-that-can-handle-missing-files gives some pointers on how to handle removed source files for generating .o files. I'm using gcc/g++, so adding the -MP option when generating dependencies works great for me, until I get to the link stage with my .a file. What about updating archives/libraries when input sources go away? This works OK for me, but is there a cleaner way (i.e., something as straightforward as the g++ -MP option)?

        #BUILD_DIR is my target directory (includes Debug/Release and target arch)
        #SRC_OUTS are my .o files
        LIBATLS_HAS = $(shell nm ${BUILD_DIR}/libatls.a | grep ${BUILD_DIR} | sed -e 's/.*(//' -e 's/).*://')
        LIBATLS_REMOVE = $(filter-out $(notdir ${SRC_OUTS}), ${LIBATLS_HAS})

        ${BUILD_DIR}/libatls.a: ${BUILD_DIR}/libatls.a(${SRC_OUTS})
        ifneq ($(strip ${LIBATLS_REMOVE}),)
            $(AR) -d $@ ${LIBATLS_REMOVE}
        endif

  • Prevent Chrome from automatically opening downloaded PDF and Image files

    - by Phoenix
    When I download a PDF or image in Google Chrome on my Mac, is it possible to prevent Chrome from automatically opening it in my default application for that file type (e.g., Preview)? I notice that Chrome does not do this for other downloaded files such as audio and ZIP archives. I still want to be able to preview files in Chrome; I just want to prevent it from automatically launching my image/PDF viewer application after I download them. For example:

    1. I click on a link in an email to a PDF document or an image file.
    2. Chrome displays the contents in the browser.
    3. I press Cmd-S and save the file to my computer.
    4. When the download finishes, the file opens automatically in Preview.app.

    It's that last step that I would like to bypass.

  • Storage of various linux config files

    - by stantona
    I'm using git to track/store all the various config files I need on Linux. They're organized as if they live in my home directory, e.g.:

        .Xresources
        .config/
            Awesome
                rc.lua
        .xmodmap
        .zshrc
        vim/    <- submodule
        emacs/  <- submodule
        etc

    I use git submodules for other things like vim/emacs configuration (since I also want to keep those as separate repos). I'm thinking of creating a shell script to create the various links to these files. The goal is to make it painless to set up another Linux machine. Is this a reasonable idea? Is there a preferred approach? I'm mostly interested in hearing how other people store their configs.
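
The linking step the poster has in mind could look roughly like this. It is a sketch in Python rather than the shell script mentioned in the post; `link_dotfiles` is a hypothetical helper, and the repository and home directories are temporary stand-ins so the example can run anywhere symlinks are permitted.

```python
import os
import tempfile


def link_dotfiles(repo_dir, home_dir, names):
    """Symlink each named file from repo_dir into home_dir, skipping existing targets."""
    linked = []
    for name in names:
        source = os.path.join(repo_dir, name)
        target = os.path.join(home_dir, name)
        if os.path.lexists(target):
            continue  # never overwrite a file or link that is already there
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.symlink(source, target)
        linked.append(name)
    return linked


with tempfile.TemporaryDirectory() as tmp:
    repo = os.path.join(tmp, "dotfiles")  # stand-in for the git checkout
    home = os.path.join(tmp, "home")      # stand-in for the home directory
    os.makedirs(os.path.join(repo, ".config", "Awesome"))
    os.makedirs(home)
    wanted = [".zshrc", ".xmodmap", ".config/Awesome/rc.lua"]
    for name in wanted:
        open(os.path.join(repo, name), "w").close()
    linked = link_dotfiles(repo, home, wanted)
    is_link = os.path.islink(os.path.join(home, ".zshrc"))
    second_run = link_dotfiles(repo, home, wanted)
```

Skipping targets that already exist makes the script safe to re-run on a machine that is partly set up.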

  • How to restore qmail backup files

    - by Maysam
    We are using qmail as our mail application on a Linux server. A few weeks ago our server crashed, and we had everything installed from scratch so that our users could send and receive email again. The problem is that they have lost their old emails. We have a backup of the whole qmail directory, but I don't know how to restore the old emails without losing the new ones. It's worth mentioning that I don't have any problem restoring old sent mail: when I copy email files into the .sent-mail/cur directory, they reappear in the users' sent boxes. Restoring files into the /cur directory doesn't work for inbox emails, though, and I can't get them restored.

  • Cisco NAC: help with enabling FTP or moving update files

    - by kyoung
    Hi, this is both a Linux question and a Cisco NAC question. I'm trying to update our server from 4.1 to 4.7, and I need to move some tarball files to the NAC. The NAC Appliance runs some strange stripped-down version of Fedora Core 4.

    Copying the upgrade: the instructions say to FTP the file to the NAC appliance; however, whenever I use WinSCP with root credentials, I get a notice informing me that the connection was actively refused. I can't for the life of me find any .conf files that sound like winners, so I don't know how to change the settings. The ftp command itself does seem to work, however. What exactly should I do here?

  • Replace files with symlink

    - by soandos
    This question is intended to be the inverse of Replace Symbolic Links with Files, but for Windows. I have started running out of space on my SSD, and I found that about 12% of the used space is in my installer folder (which holds the .msi files for all the programs that I have installed). I am looking for two things:

    1. A way to move this (or any) folder via symlink. Ideally, some PowerShell function that I could use to designate a folder and a destination, and the symlink would be created in the original location (pointing to the destination).
    2. In this particular case, a registry change that would allow the location to be moved would also be helpful, but I would still prefer solution 1.

    How can this be done?

  • How do I list all the files for a commit in git

    - by Philip Fourie
    I need to write a script that retrieves all files that were committed for a given SHA1. I am having difficulty getting a nicely formatted list of all files that were part of the commit. I have tried:

        git show a303aa90779efdd2f6b9d90693e2cbbbe4613c1d

    Although this lists the files, it also includes additional diff information that I don't need. I am hoping there is a simple git command that will provide such a list without me having to parse it from the output of the above command.
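
One git command that prints only file names for a commit is `git diff-tree --no-commit-id --name-only -r <sha>`. The sketch below wraps it for use from a script; the two function names are hypothetical, and `files_in_commit` assumes git is installed and `repo_dir` is a working repository. For the very first commit in a repository, `git diff-tree` additionally needs `--root`.

```python
import subprocess


def files_in_commit_command(sha):
    """Build the git command that lists only the file names touched by a commit."""
    return ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", sha]


def files_in_commit(sha, repo_dir="."):
    """Run the command in repo_dir and return one path per list entry."""
    result = subprocess.run(
        files_in_commit_command(sha),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git reports an error (e.g. unknown SHA)
    )
    return result.stdout.splitlines()


command = files_in_commit_command("a303aa90779efdd2f6b9d90693e2cbbbe4613c1d")
```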
