Search Results

Search found 45804 results on 1833 pages for 'large files'.

Page 42/1833 | < Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >

Manage Large E-Book Archive

- by Cnkt

I have a very large e-book archive (approx. 1TB) including various file formats eg. PDF, DJVU, MOBI and EPUB. I put them in different folders by subject eg. Engineering, Programming etc. But after many years, things are going crazy. The programming folder itself is 220GB and file names are cryptic. Some filenames are well defined like: 236659889_Final_Report_of_2012_Climate_Change_Conference.pdf but some filenames are just ISBN numbers or just download.pdf. I need an application for organizing and searching my e-books. I already tried Calibre, Mendeley and Debenu. But all these apps try to import files first and I dont have any spare 1+ TB for the apps import folder. Is there any good Windows application for just indexing filenames and contents of ebooks without importing them?

Read the article
Best archive format & tool for large amounts of data (50gb+)

- by marcusstarnes

I only realised this afternoon that the ZIP format has a limit of what appears to be around 20gb. I am trying to automate an archive process (using Automate) to zip/rar/whatever a collection of folders/files on one of my disks. It always appeared to bomb out with an incomplete archive at about 20gb. So I tried using WinRAR and doing it manually as a ZIP file, but it told me of the limit. So, I was wondering, what is a recommended zip format (and tool for accomplishing the task) for archiving up a large amount of data (around 50gb)? Thanks

Read the article
Large volume at /mnt on AWS instance

- by rhaag71

I know this is probably a somewhat 'dumb' question :) I have an AWS (small) instance and I just noticed that there is a ~150gb volume attached at /mnt, is this normal? It kinda freaked me out, I was thinking maybe someone was trying to capture whatever I mount in /mnt, there is the entry in my fstab too (and I found that others have this by googling)... the entry is as follows /dev/xvdb /mnt auto defaults,nobootwait,comment=cloudconfig 0 2 I don't have any volumes this large in my AWS volumes section though. I was just trying to understand this and be sure that someone is not trying to 'get in'... as there are many attempts daily. Thanks

Read the article
Archive format & tool for large amounts of data (50gb+)

- by marcusstarnes

I only realised this afternoon that the ZIP format has a limit of what appears to be around 20gb. I am trying to automate an archive process (using Automate) to zip/rar/whatever a collection of folders/files on one of my disks. It always appeared to bomb out with an incomplete archive at about 20gb. So I tried using WinRAR and doing it manually as a ZIP file, but it told me of the limit. So, I was wondering, what is a recommended zip format (and tool for accomplishing the task) for archiving up a large amount of data (around 50gb)?

Read the article
How would I batch rename a lot of files using command-line?

- by Whisperity

I have a problem which I am unable to solve: I need to rename a great dump of files using patterns. I tried using this, but I always get an error. I have a folder, inside with a lot of files. Running ls -1 | wc -l, it returns that I have like 160000 files inside. The problem is, that I wish to move these files to a Windows system, but most of them have characters like : and ? in them, which makes the file unaccessible on said Windows-based systems. (As a "do not solve but deal with" method, I tried booting up a LiveCD on the Windows system and moving the files using the live OS. Under that Ubuntu, the files were readable and writable on the mounted NTFS partition, but when I booted back on Windows, it showed that the file is there but Windows was unable to access it in any fashion: rename, delete or open.) I tried running rename 's/\:/_' * inside the folder, but I got Argument list too long error. Some search revealed that it happens because I have so many files, and then I arrived here. The problem is that I don't know how to alter the command to suit my needs, as I always end up having various errors like Trying find -name '*:*' | xargs rename : _, it gives xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option [\n] syntax error at (eval 1) line 1, near ":" [\n] xargs: rename: exited with status 255; aborting Adding the -0 after xargs turns the error message to xargs: argument line too long These files are archive files generated by various PHP scripts. The best solution would be having a chance to rename them before they are moved to Windows, but if there is no way to do it, we might have a way to rename the files while they are moved to Windows. I use samba and proftpd to move the files. Unfortunately, graphical software are out of the question as the server containing the files is what it is, a server, with only command-line interface.

Read the article
Is there a way to identify that a file has been modified and moved?

- by Eric

I'm writing an application that catalogs files, and attributes them with extra meta data through separate "side-car" files. If changes to the files are made through my program then it is able to keep everything in sync between them and their corresponding meta data files. However, I'm trying to figure out a way to deal with someone modifying the files manually while my program is not running. When my program starts up it scans the file system and compares the files it finds to it's previous record of what files it remembers being there. It's fairly straight forward to update after a file has been deleted or added. However, if a file was moved or renamed then my program sees that as the old file being deleted, and the new file being added. Yet I don't want to loose the association between the file and its metadata. I was thinking I could store a hash from each file so I could check to see if newly found files were really previously known files that had been moved or renamed. However, if the file is both moved/renamed and modified then the hash would not match either. So is there some other unique identifier of a file that I can track which stays with it even after it is renamed, moved, or modified?

Read the article
Web Site Performance and Assembly Versioning – Part 3 Versioning Combined Files Using Mercurial

- by capgpilk

Minification and Concatination of JavaScript and CSS Files Versioning Combined Files Using Subversion Versioning Combined Files Using Mercurial – this post I have worked on a project recently where there was a need to version the system (library dll, css and javascript files) by date and Mercurial revision number. This was in the format:- 0.12.524.407 {major}.{year}.{month}{date}.{mercurial revision} Each time there is an internal build using the CI server, it would label the files using this format. When it came time to do a major release, it became v1.{year}.{month}{date}.{mercurial revision}, with each public release having a major version increment. Also as a requirement, each assembly also had to have a new GUID on each build. So like in previous posts, we need to edit the csproj file, and add a couple of Default targets. 1: <?xml version="1.0" encoding="utf-8"?> 2: <Project ToolsVersion="4.0" DefaultTargets="Hg-Revision;AssemblyInfo;Build" 3: xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> 4: <PropertyGroup> Right below the closing tag of the entire project we add our two targets, the first is to get the Mercurial revision number. We first need to import the tasks for MSBuild which can be downloaded from http://msbuildhg.codeplex.com/ 1: <Import Project="..\Tools\MSBuild.Mercurial\MSBuild.Mercurial.Tasks" /> 1: <Target Name="Hg-Revision"> 2: <HgVersion LocalPath="$(MSBuildProjectDirectory)" Timeout="5000" 3: LibraryLocation="C:\TortoiseHg\"> 4: <Output TaskParameter="Revision" PropertyName="Revision" /> 5: </HgVersion> 6: <Message Text="Last revision from HG: $(Revision)" /> 7: </Target> With the main Mercurial files being located at c:\TortoiseHg To get a valid GUID we need to escape from the csproj markup and call some c# code which we put in a property group for later reference. 1: <PropertyGroup> 2: <GuidGenFunction> 3: <![CDATA[ 4: public static string ScriptMain() { 5: return System.Guid.NewGuid().ToString().ToUpper(); 6: } 7: ]]> 8: </GuidGenFunction> 9: </PropertyGroup> Now we add in our target for generating the GUID. 1: <Target Name="AssemblyInfo"> 2: <Script Language="C#" Code="$(GuidGenFunction)"> 3: <Output TaskParameter="ReturnValue" PropertyName="NewGuid" /> 4: </Script> 5: <Time Format="yy"> 6: <Output TaskParameter="FormattedTime" PropertyName="year" /> 7: </Time> 8: <Time Format="Mdd"> 9: <Output TaskParameter="FormattedTime" PropertyName="daymonth" /> 10: </Time> 11: <AssemblyInfo CodeLanguage="CS" OutputFile="Properties\AssemblyInfo.cs" 12: AssemblyTitle="name" AssemblyDescription="description" 13: AssemblyCompany="none" AssemblyProduct="product" 14: AssemblyCopyright="Copyright ©" 15: ComVisible="false" CLSCompliant="true" Guid="$(NewGuid)" 16: AssemblyVersion="$(Major).$(year).$(daymonth).$(Revision)" 17: AssemblyFileVersion="$(Major).$(year).$(daymonth).$(Revision)" /> 18: </Target> So this will give use an AssemblyInfo.cs file like this just prior to calling the Build task:- 1: using System; 2: using System.Reflection; 3: using System.Runtime.CompilerServices; 4: using System.Runtime.InteropServices; 5: 6: [assembly: AssemblyTitle("name")] 7: [assembly: AssemblyDescription("description")] 8: [assembly: AssemblyCompany("none")] 9: [assembly: AssemblyProduct("product")] 10: [assembly: AssemblyCopyright("Copyright ©")] 11: [assembly: ComVisible(false)] 12: [assembly: CLSCompliant(true)] 13: [assembly: Guid("9C2C130E-40EF-4A20-B7AC-A23BA4B5F2B7")] 14: [assembly: AssemblyVersion("0.12.524.407")] 15: [assembly: AssemblyFileVersion("0.12.524.407")] Therefore giving us the correct version for the assembly. This can be referenced within your project whether web or Windows based like this:- 1: public static string AppVersion() 2: { 3: return Assembly.GetExecutingAssembly().GetName().Version.ToString(); 4: } As mentioned in previous posts in this series, you can label css and javascript files using this version number and the GetAssemblyIdentity task from the main MSBuild task library build into the .Net framework. 1: <GetAssemblyIdentity AssemblyFiles="bin\TheAssemblyFile.dll"> 2: <Output TaskParameter="Assemblies" ItemName="MyAssemblyIdentities" /> 3: </GetAssemblyIdentity> Then use this to write out the files:- 1: <WriteLinestoFile 2: File="Client\site-style-%(MyAssemblyIdentities.Version).combined.min.css" 3: Lines="@(CSSLinesSite)" Overwrite="true" />

Read the article
6 Ways to Free Up Hard Drive Space Used by Windows System Files

- by Chris Hoffman

We’ve previously covered the standard ways to free up space on Windows. But if you have a small solid-state drive and really want more hard space, there are geekier ways to reclaim hard drive space. Not all of these tips are recommended — in fact, if you have more than enough hard drive space, following these tips may actually be a bad idea. There’s a tradeoff to changing all of these settings. Erase Windows Update Uninstall Files Windows allows you to uninstall patches you install from Windows Update. This is helpful if an update ever causes a problem — but how often do you need to uninstall an update, anyway? And will you really ever need to uninstall updates you’ve installed several years ago? These uninstall files are probably just wasting space on your hard drive. A recent update released for Windows 7 allows you to erase Windows Update files from the Windows Disk Cleanup tool. Open Disk Cleanup, click Clean up system files, check the Windows Update Cleanup option, and click OK. If you don’t see this option, run Windows Update and install the available updates. Remove the Recovery Partition Windows computers generally come with recovery partitions that allow you to reset your computer back to its factory default state without juggling discs. The recovery partition allows you to reinstall Windows or use the Refresh and Reset your PC features. These partitions take up a lot of space as they need to contain a complete system image. On Microsoft’s Surface Pro, the recovery partition takes up about 8-10 GB. On other computers, it may be even larger as it needs to contain all the bloatware the manufacturer included. Windows 8 makes it easy to copy the recovery partition to removable media and remove it from your hard drive. If you do this, you’ll need to insert the removable media whenever you want to refresh or reset your PC. On older Windows 7 computers, you could delete the recovery partition using a partition manager — but ensure you have recovery media ready if you ever need to install Windows. If you prefer to install Windows from scratch instead of using your manufacturer’s recovery partition, you can just insert a standard Window disc if you ever want to reinstall Windows. Disable the Hibernation File Windows creates a hidden hibernation file at C:\hiberfil.sys. Whenever you hibernate the computer, Windows saves the contents of your RAM to the hibernation file and shuts down the computer. When it boots up again, it reads the contents of the file into memory and restores your computer to the state it was in. As this file needs to contain much of the contents of your RAM, it’s 75% of the size of your installed RAM. If you have 12 GB of memory, that means this file takes about 9 GB of space. On a laptop, you probably don’t want to disable hibernation. However, if you have a desktop with a small solid-state drive, you may want to disable hibernation to recover the space. When you disable hibernation, Windows will delete the hibernation file. You can’t move this file off the system drive, as it needs to be on C:\ so Windows can read it at boot. Note that this file and the paging file are marked as “protected operating system files” and aren’t visible by default. Shrink the Paging File The Windows paging file, also known as the page file, is a file Windows uses if your computer’s available RAM ever fills up. Windows will then “page out” data to disk, ensuring there’s always available memory for applications — even if there isn’t enough physical RAM. The paging file is located at C:\pagefile.sys by default. You can shrink it or disable it if you’re really crunched for space, but we don’t recommend disabling it as that can cause problems if your computer ever needs some paging space. On our computer with 12 GB of RAM, the paging file takes up 12 GB of hard drive space by default. If you have a lot of RAM, you can certainly decrease the size — we’d probably be fine with 2 GB or even less. However, this depends on the programs you use and how much memory they require. The paging file can also be moved to another drive — for example, you could move it from a small SSD to a slower, larger hard drive. It will be slower if Windows ever needs to use the paging file, but it won’t use important SSD space. Configure System Restore Windows seems to use about 10 GB of hard drive space for “System Protection” by default. This space is used for System Restore snapshots, allowing you to restore previous versions of system files if you ever run into a system problem. If you need to free up space, you could reduce the amount of space allocated to system restore or even disable it entirely. Of course, if you disable it entirely, you’ll be unable to use system restore if you ever need it. You’d have to reinstall Windows, perform a Refresh or Reset, or fix any problems manually. Tweak Your Windows Installer Disc Want to really start stripping down Windows, ripping out components that are installed by default? You can do this with a tool designed for modifying Windows installer discs, such as WinReducer for Windows 8 or RT Se7en Lite for Windows 7. These tools allow you to create a customized installation disc, slipstreaming in updates and configuring default options. You can also use them to remove components from the Windows disc, shrinking the size of the resulting Windows installation. This isn’t recommended as you could cause problems with your Windows installation by removing important features. But it’s certainly an option if you want to make Windows as tiny as possible. Most Windows users can benefit from removing Windows Update uninstallation files, so it’s good to see that Microsoft finally gave Windows 7 users the ability to quickly and easily erase these files. However, if you have more than enough hard drive space, you should probably leave well enough alone and let Windows manage the rest of these settings on its own. Image Credit: Yutaka Tsutano on Flickr

Read the article
iptables management tools for large scale environment

- by womble

The environment I'm operating in is a large-scale web hosting operation (several hundred servers under management, almost-all-public addressing, etc -- so anything that talks about managing ADSL links is unlikely to work well), and we're looking for something that will be comfortable managing both the core ruleset (around 12,000 entries in iptables at current count) plus the host-based rulesets we manage for customers. Our core router ruleset changes a few times a day, and the host-based rulesets would change maybe 50 times a month (across all the servers, so maybe one change per five servers per month). We're currently using filtergen (which is balls in general, and super-balls at our scale of operation), and I've used shorewall in the past at other jobs (which would be preferable to filtergen, but I figure there's got to be something out there that's better than that). The "musts" we've come up with for any replacement system are: Must generate a ruleset fairly quickly (a filtergen run on our ruleset takes 15-20 minutes; this is just insane) -- this is related to the next point: Must generate an iptables-restore style file and load that in one hit, not call iptables for every rule insert Must not take down the firewall for an extended period while the ruleset reloads (again, this is a consequence of the above point) Must support IPv6 (we aren't deploying anything new that isn't IPv6 compatible) Must be DFSG-free Must use plain-text configuration files (as we run everything through revision control, and using standard Unix text-manipulation tools are our SOP) Must support both RedHat and Debian (packaged preferred, but at the very least mustn't be overtly hostile to either distro's standards) Must support the ability to run arbitrary iptables commands to support features that aren't part of the system's "native language" Anything that doesn't meet all these criteria will not be considered. The following are our "nice to haves": Should support config file "fragments" (that is, you can drop a pile of files in a directory and say to the firewall "include everything in this directory in the ruleset"; we use configuration management extensively and would like to use this feature to provide service-specific rules automatically) Should support raw tables Should allow you to specify particular ICMP in both incoming packets and REJECT rules Should gracefully support hostnames that resolve to more than one IP address (we've been caught by this one a few times with filtergen; it's a rather royal pain in the butt) The more optional/weird iptables features that the tool supports (either natively or via existing or easily-writable plugins) the better. We use strange features of iptables now and then, and the more of those that "just work", the better for everyone.

Read the article
Windows 2003 Storage Server Hanging on Large File Transfers

- by user25272

In one of our offices we have a Dell PowerVault 745N NAS device which acts as the main file server. Its running 32bit Windows 2003 Storage Server SP2 with 3GB RAM. The server holds around 60 users HOME folders, which are mapped via AD. The office clients are a mix of XP SP3, Vista and Windows 7. Occasionally the server will completely hang when transferring large files. When the hang happens the console becomes unresponsive with only the mouse active and blank wallpaper. Sometimes stopping the copy frees the server, sometimes not. The hanging can last around 20 minutes. During this time other servers also become unresponsive with blank wallpaper at the console. If you do manage to get onto another server the taskbar and run commands are unresponsive. This also transcends to the client computers sometimes with explorer crashing. I'm guessing this is due to the HOME folder mapping. Eventually the NAS server with free up and everything will be back to normal. The server is configured as follows: PERC 4/DC DATA 2 - 12 SCSI HDD - RAID5 SHADOWCOPY 2 SCSI HDD - RAID1 CERC SATA DATA 11 4 SATA HDD - RAID5 OS 4 SATA HDD - RAID5 All the drivers and firmware is up to date. I've been through all the diagnostics with Dell and the hardware has come up clean including full HDD tests on the arrays. The server has NOD32 installed as the AV, but the hanging happens when it is uninstalled. There are no errors in the event log when this happens and we don't have any errors logged on any of our ProCurve switches. DNS is fine on the domain and AD from what I can tell is running happily. There are no DFS or NFS shares setup either. All the shares are standard Windows. I've unchecked the allow the computer to turn off this device to save power box under Power Management on the NIC. "Set Link Speed and Duplex to Auto-negotiate 1000 " Increased Receive Descriptors buffer from 256 to 352 (reserves more CPU resource for handling data) I've run network traces using network monitor and have found the following: 417 8.078125 {SMB:192, NbtSS:25, TCP:24, IPv4:23} 192.168.2.244 192.168.5.35 SMB SMB:R; Nt Create Andx - NT Status: System - Error, Code = (52) STATUS_OBJECT_NAME_NOT_FOUND I've tried different cabling; NICs and switch ports all with the same result. Transferring files from other servers on the domain is fine. All I haven't done is run CHKDSK on the drives to look for any file system errors. On the Vista clients I have also run netsh interface tcp set global autotuning=disabled with no result. Could it be that the server has a faulty drive or that the I/O is too much for it to handle? Any ideas why would the hang cause issues with the other servers on the LAN? Many Thanks.

Read the article
method for transfering large files for newbies

- by doug

Hi there One of my friends is now in china and he wana send me his home-mode video files. I have a linux hosting account on godaddy and i've configured a ftp account for him. Unfortunately he has trouble in using the ftp account. Can you recommend a better option? TY

Read the article
Copying large number of files from one directory to another in linux

- by Ritesh Sharma

Hi all, I've a directory containing around 2.8 lacs of files. I want to move them to another directory. If I use 'cp' or 'mv' then I get an error 'argument list too long'. If I write a script like for file in ls *; do cp {source} to {destination} done then because of 'ls' command , its performance degrades. How can I do this?

Read the article
IIS large amount of time before loading

- by Lukes123

I am running an ASP.NET 3.5 website on IIS 6 with Server 2003. Whenever I modify any of the ASPX files, any page on the site then takes about 2-3 minutes before it starts to load. Even the smallest modification causes this to happen. Why is this?

Read the article
What is your favorite way to read XML files using C#?

- by stacker

Let's take this xml structure as example: <?xml version="1.0" encoding="utf-8"?> <Configuration-content> <XFile Name="file name 1" /> <XFile Name="name2" /> <XFile Name="name3" /> <XFile Name="name4" /> </Configuration-content> public class Configuration { public XFile[] Files { get; set; } } public interface IConfigurationRipository { Configuration Get(); void Save(Configuration entity); } I wonder what's the best way to do that. The task is to implement IConfigurationRipository using your favorite approach.

Read the article
Using ServletOutputStream to write very large files in a Java servlet without memory issues

- by Martin

I am using IBM Websphere Application Server v6 and Java 1.4 and am trying to write large CSV files to the ServletOutputStream for a user to download. Files are ranging from a 50-750MB at the moment. The smaller files aren't causing too much of a problem but with the larger files it appears that it is being written into the heap which is then causing an OutOfMemory error and bringing down the entire server. These files can only be served out to authenticated users over https which is why I am serving them through a Servlet instead of just sticking them in Apache. The code I am using is (some fluff removed around this): resp.setHeader("Content-length", "" + fileLength); resp.setContentType("application/vnd.ms-excel"); resp.setHeader("Content-Disposition","attachment; filename=\"export.csv\""); FileInputStream inputStream = null; try { inputStream = new FileInputStream(path); byte[] buffer = new byte[1024]; int bytesRead = 0; do { bytesRead = inputStream.read(buffer, offset, buffer.length); resp.getOutputStream().write(buffer, 0, bytesRead); } while (bytesRead == buffer.length); resp.getOutputStream().flush(); } finally { if(inputStream != null) inputStream.close(); } The FileInputStream doesn't seem to be causing a problem as if I write to another file or just remove the write completly the memory usage doesn't appear to be a problem. What I am thinking is that the resp.getOutputStream().write is being stored in memory until the data can be sent through to the client. So the entire file might be read and stored in the resp.getOutputStream() causing my memory issues and crashing! I have tried Buffering these streams and also tried using Channels from java.nio, none of which seems to make any bit of difference to my memory issues. I have also flushed the outputstream once per iteration of the loop and after the loop, which didn't help.

Read the article
NHibernate - Stream large result sets?

- by Dan Black

Hi, I have to read in a large record set, process it, then write it out to a flat file. The large result set comes from a Stored Proc in SQL 2000. I currently have: var results = session.CreateSQLQuery("exec usp_SalesExtract").List(); I would like to be able to read the result set row by row, to reduce the memory foot print Thanks

Read the article
Ruby Large HTML getting error, limit to header size

- by Joe Stein

def mailTo(subject,msg,folks) begin Net::SMTP.start('localhost', 25) do |smtp| smtp.send_message "MIME-Version: 1.0\nContent-type: text/html\nSubject: #{subject}\n#{msg}\n#{DateTime.now}\n", '[email protected]', folks end rescue => e puts "Emailing Sending Error - #{e}" end end when the HTML is VERY large I get this exception Emailing Sending Error - 552 5.6.0 Headers too large (32768 max) how can i get a larger html above max to work with Net::SMTP in Ruby

Read the article
Ruby Large HTML emails getting error, limit to header size

- by Joe Stein

def mailTo(subject,msg,folks) begin Net::SMTP.start('localhost', 25) do |smtp| smtp.send_message "MIME-Version: 1.0\nContent-type: text/html\nSubject: #{subject}\n#{msg}\n#{DateTime.now}\n", '[email protected]', folks end rescue => e puts "Emailing Sending Error - #{e}" end end when the HTML is VERY large I get this exception Emailing Sending Error - 552 5.6.0 Headers too large (32768 max) how can i get a larger html above max to work with Net::SMTP in Ruby

Read the article
writing large excel spreadsheets

- by pstanton

has anybody found a library that works well with large spreadsheets? I've tried apache's POI but it fails miserably working with large files - both reading and writing. It uses massive amounts of memory leaving you needing a supercomputer to parse or create a 20+mb spreadsheet. Surely there is a more memory efficient way and someone has written it?!

Read the article
Efficient file buffering & scanning methods for large files in python

- by eblume

The description of the problem I am having is a bit complicated, and I will err on the side of providing more complete information. For the impatient, here is the briefest way I can summarize it: What is the fastest (least execution time) way to split a text file in to ALL (overlapping) substrings of size N (bound N, eg 36) while throwing out newline characters. I am writing a module which parses files in the FASTA ascii-based genome format. These files comprise what is known as the 'hg18' human reference genome, which you can download from the UCSC genome browser (go slugs!) if you like. As you will notice, the genome files are composed of chr[1..22].fa and chr[XY].fa, as well as a set of other small files which are not used in this module. Several modules already exist for parsing FASTA files, such as BioPython's SeqIO. (Sorry, I'd post a link, but I don't have the points to do so yet.) Unfortunately, every module I've been able to find doesn't do the specific operation I am trying to do. My module needs to split the genome data ('CAGTACGTCAGACTATACGGAGCTA' could be a line, for instance) in to every single overlapping N-length substring. Let me give an example using a very small file (the actual chromosome files are between 355 and 20 million characters long) and N=8 import cStringIO example_file = cStringIO.StringIO("""\ header CAGTcag TFgcACF """) for read in parse(example_file): ... print read ... CAGTCAGTF AGTCAGTFG GTCAGTFGC TCAGTFGCA CAGTFGCAC AGTFGCACF The function that I found had the absolute best performance from the methods I could think of is this: def parse(file): size = 8 # of course in my code this is a function argument file.readline() # skip past the header buffer = '' for line in file: buffer += line.rstrip().upper() while len(buffer) = size: yield buffer[:size] buffer = buffer[1:] This works, but unfortunately it still takes about 1.5 hours (see note below) to parse the human genome this way. Perhaps this is the very best I am going to see with this method (a complete code refactor might be in order, but I'd like to avoid it as this approach has some very specific advantages in other areas of the code), but I thought I would turn this over to the community. Thanks! Note, this time includes a lot of extra calculation, such as computing the opposing strand read and doing hashtable lookups on a hash of approximately 5G in size. Post-answer conclusion: It turns out that using fileobj.read() and then manipulating the resulting string (string.replace(), etc.) took relatively little time and memory compared to the remainder of the program, and so I used that approach. Thanks everyone!

Read the article
In Sinatra, how can I serve static index.html files in subdirectories in public folder?

- by socrateos

I noticed that Sinatra does not recognize index.html files in public folder's subdirectories and returns an error when url is pointing to a directory without specifiying the file name. For example, if user enters a url like "www.mydomain.com/subdiretory/", Sinatra fails to recognize the existence of an index.html file in that directory. There are hundreds of subdirectories in my public folder so that it is impossible to specify each one of them in code (and the number of subdirectories keeps growing). How can I tell Sinatra to leave my web server (Apache) alone (to server index.html file) if there is an index.html file in a subdirectory of public folder when url is pointing to that directory without the file name?

Read the article
C++ program to calculate large factorials

- by xbonez

How can I write a c++ program to calculate large factorials. Example, if I want to calculate (100!) / (99!), we know the answer is 100, but if i calculate the factorials of the numerator and denominator individually, both the numbers are gigantically large.

Read the article
Error in connecting Eclipse to SQL Server

- by user3721900

This is the syntax error Jun 10, 2014 5:15:51 PM org.apache.catalina.core.AprLifecycleListener init INFO: The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: C:\Program Files (x86)\Java\jre7\bin;C:\Windows\Sun\Java\bin;C:\Windows\system32;C:\Windows;C:/Program Files (x86)/Java/jre7/bin/client;C:/Program Files (x86)/Java/jre7/bin;C:/Program Files (x86)/Java/jre7/lib/i386;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS Client\;C:\Program Files (x86)\AMD APP\bin\x86_64;C:\Program Files (x86)\AMD APP\bin\x86;C:\Program Files\Common Files\Microsoft Shared\Windows Live;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Windows Live\Shared;C:\Program Files (x86)\ATI Technologies\ATI.ACE\Core-Static;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program Files (x86)\Microsoft ASP.NET\ASP.NET Web Pages\v1.0\;C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\;C:\Program Files\Microsoft SQL Server\110\Tools\Binn\;C:\Program Files (x86)\Microchip\MPLAB C32 Suite\bin;C:\Program Files\Java\jdk1.7.0_25\bin;C:\Program Files (x86)\Java\jdk1.7.0_03\bin;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\;c:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\;c:\Program Files\Microsoft SQL Server\100\Tools\Binn\;c:\Program Files (x86)\Microsoft SQL Server\100\DTS\Binn\;c:\Program Files\Microsoft SQL Server\100\DTS\Binn\;C:\Program Files (x86)\Google\google_appengine\;C:\Users\Patrick\Desktop\2013-2014 2nd Sem Files\Eclipsee\eclipse;;. Jun 10, 2014 5:15:51 PM org.apache.tomcat.util.digester.SetPropertiesRule begin WARNING: [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property 'source' to 'org.eclipse.jst.jee.server:B2B' did not find a matching property. Jun 10, 2014 5:15:51 PM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["http-bio-8080"] Jun 10, 2014 5:15:51 PM org.apache.coyote.AbstractProtocol init INFO: Initializing ProtocolHandler ["ajp-bio-8009"] Jun 10, 2014 5:15:51 PM org.apache.catalina.startup.Catalina load INFO: Initialization processed in 544 ms Jun 10, 2014 5:15:51 PM org.apache.catalina.core.StandardService startInternal INFO: Starting service Catalina Jun 10, 2014 5:15:51 PM org.apache.catalina.core.StandardEngine startInternal INFO: Starting Servlet Engine: Apache Tomcat/7.0.42 Jun 10, 2014 5:15:52 PM org.apache.coyote.AbstractProtocol start INFO: Starting ProtocolHandler ["http-bio-8080"] Jun 10, 2014 5:15:52 PM org.apache.coyote.AbstractProtocol start INFO: Starting ProtocolHandler ["ajp-bio-8009"] Jun 10, 2014 5:15:52 PM org.apache.catalina.startup.Catalina start INFO: Server startup in 374 ms com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '`'. This is my code package b2b.fishermall; public class ConnectionString extends SqlStringCommands { public String getDriver(){ return "com.microsoft.sqlserver.jdbc.SQLServerDriver"; } public String getURL() { return "jdbc:sqlserver://localhost:1433;databaseName=B2B;integratedSecurity=true;"; } public String getUsername() { return ""; } public String getDbPassword() { return ""; } }

Read the article
Regex to catch all files but those starting with "."

- by tmslnz

In a directory with mixed content such as: .afile .anotherfile bfile.file bnotherfile.file .afolder/ .anotherfolder/ bfolder/ bnotherfolder/ How would you catch everything but the files (not dirs) starting with .? I have tried with a negative lookahead ^(?!\.).+? but it doesn't seem to work right. Please note that I would like to avoid doing it by excluding the . by using [a-zA-Z< plus all other possible chars minus the dot >] Any suggestions?

Read the article
Sorting a very large text file in Java

- by Alice

Hi, I have a large text file I need to sort in Java. The format is: word [tab] frequency [new line] The algorithm for sorting is: Read some of the file, filtering for purlely alphabetic words. Once you have X number of alphabetic words, call Collections.sort and write the result to a file. Repeat until you have finished reading the file. Start reading two sorted files, comparing line by line for the word with higher frequency, and writing at the same time to a new file as to not load much into your memory Repeat until all files are merged into one large file Right now I've divided the large file into smaller ones (sorted by descending frequency) with 10,000 lines each. I know I need to somehow merge these files back together, but I'm not sure how to go about this. I've created a LinkedList to keep track of all the files created. The algorithm says to compare each line in the two files, except I've tried a case where , say file1 = 8,6,5,3,1 and file2 = 9,8,8,8,8. Then if I compare them line by line I would get file3 = 9,8,8,6,8,5,8,3,8,1 which is incorrectly sorted (they should be in decreasing order). I think I'm misunderstanding some part of the algorithm. If someone could point out what I should do instead, I'd greatly appreciate it. Thanks. edit: Yes this is an assignment. We aren't allowed to increase memory unfortunately :(

Read the article

Search Results

Search found 45804 results on 1833 pages for 'large files'.

Page 42/1833 | < Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >

- by Cnkt

- by marcusstarnes

- by rhaag71

- by marcusstarnes

- by Whisperity

- by Eric

- by capgpilk

- by Chris Hoffman

- by womble

- by user25272

- by doug

- by Ritesh Sharma

- by Lukes123

- by stacker

- by Martin

- by Dan Black

- by Joe Stein

- by Joe Stein

- by pstanton

- by eblume

- by socrateos

- by xbonez

- by user3721900

- by tmslnz

- by Alice

< Previous Page | 38 39 40 41 42 43 44 45 46 47 48 49 | Next Page >