large files - Page 624 - Developer IT

"Android Create" call fails in windows 7 - missing JDK

- by reuscam

I'm having a problem getting my android dev environment setup in Windows 7. I follow the instructions here, as well as several environment sublinks. I am using Eclipse with the Android plugin. I have installed the Java JDK several times, in various locations (jdk-6u20-windows-i586.exe) - but I am obviously missing something. Every time I run "android create avd --target 2 --name my_avd" I get an error: C:\Users\andrew>android create avd --target 2 --name my_avd WARNING: Java not found in your path. Checking it it's installed in C:\Program Files\Java instead. ERROR: No suitable Java found. In order to properly use the Android Developer Tools, you need a suitable version of Java installed on your system. We recommend that you install the JDK version of JavaSE, available here: http://java.sun.com/javase/downloads/ You can find the complete Android SDK requirements here: http://developer.android.com/sdk/requirements.html This error message is the reason for me installing the JDK several times over. First I tried installing to a location on my e: drive. I then moved it to the default loc (program files (x86)\java\jdk.6.something. I also tried forcing it to go into the program files\ path, but it still automatically installs into the (x86) path. I have added the install path to my path environment variable every single time, yet I still continue to get this error. My suspicion is that windows 7 and the android tools are not playing together well in terms of finding the JDK, but who knows, it may be something entirely different. If you have seen this error before, I would appreciate a hint.

How to restrict a content of string to less than 4MB and save that string in DB using C#

- by Pranay B

I'm working on a project where I need to extract the text from PDF files and store the whole text in a DB column. With the help of iTextSharp, I extracted the data into a string. Now I need to check whether the string exceeds the 4 MB limit and, if it does, keep only the part of the string that fits within 4 MB. This is my code:

    internal string ReadPdfFiles()
    {
        // variable to store file path
        string filePath = null;
        // open dialog box to select file
        OpenFileDialog file = new OpenFileDialog();
        // dialog box title name
        file.Title = "Select Pdf File";
        // files to be accepted by the user
        file.Filter = "Pdf file (*.pdf)|*.pdf|All files (*.*)|*.*";
        // set initial directory of computer system
        file.InitialDirectory = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
        // set restore directory
        file.RestoreDirectory = true;
        // execute if block when dialog result box click ok button
        if (file.ShowDialog() == DialogResult.OK)
        {
            // store selected file path
            filePath = file.FileName.ToString();
        }
        // file path
        // use a string array and pass all the pdf for searching
        //String filePath = @"D:\Pranay\Documentation\Working on SSAS.pdf";
        try
        {
            // creating an instance of PdfReader class
            using (PdfReader reader = new PdfReader(filePath))
            {
                // creating an instance of StringBuilder class
                StringBuilder text = new StringBuilder();
                // use loop to specify how many pages to read.
                // I started from 5th page as Piyush told
                for (int i = 5; i <= reader.NumberOfPages; i++)
                {
                    // Read the pdf
                    text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
                } // end of for(i)
                int k = 4096000;
                // Test whether the string exceeds the 4MB
                if (text.Length < k)
                {
                    // return the string
                    text1 = text.ToString();
                } // end of if
            } // end of using
        } // end try
        catch (Exception ex)
        {
            MessageBox.Show(ex.Message, "Please Do select a pdf file!!", MessageBoxButtons.OK, MessageBoxIcon.Warning);
        } // end of catch
        return text1;
    } // end of ReadPdfFiles() method

Do help me!
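
A size limit like this is normally measured in encoded bytes, while a string's length counts characters, so the two can differ. The following is a minimal sketch of byte-based truncation, written in Python rather than C# purely for illustration. The function name `truncate_to_bytes`, the UTF-8 encoding, and the reading of "4 MB" as 4 * 1024 * 1024 bytes are all assumptions, not details from the question.

```python
def truncate_to_bytes(text: str, max_bytes: int = 4 * 1024 * 1024) -> str:
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes.

    Hypothetical helper: assumes the database column stores UTF-8.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # Cut at the byte limit; errors="ignore" drops a character that the cut
    # split in the middle, so the result is always valid text.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


print(len(truncate_to_bytes("abcdef", 3)))
```

The same idea should carry over to any language that can encode a string to bytes and decode a byte prefix while discarding an incomplete trailing character.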

asp.net mvc stand alone ascx control how do i link (css and js) most efficiently

- by Julian

Hi, I need some advice. I have developed some asp.net mvc web pages. Each page has a master and some ascx controls (between 2 - 6) embedded into it a js and css file. Up to now every thing was fine. In order to improve modularity, flexibility and testability the ascx's are now expected to be able to work as stand alone controls. (Each ascx has also got its own css and js files in some cases it has another control inside it) In order to meet this requirement we call the controller with the relevant parameters and it returns the ascx (partial) directly to the browser without all of the other parts of the original page . In order to get it to display correctly (css) and act correctly (js/jquery) all of the relevant files need to be added (as links or scripts eg. href="<%= ResolveUrl(styleSheet)%>") to the user control. This is "contradicting" the concept of positioning the files at the most logical place (could be the master page for example). How can I overcome this problem? Keep in mind that this is relevant for each "control" ascx file. Any thoughts will be appreciated.

I built my own web framework: now, how do I keep it in sync with my applications? Must I use versions?

- by Daniel Koch

... and I did the first web application using it, now I'm going to create the second. In this first web application I enhanced the framework's core library with new things and promptly updated framework branch. I'm using bazaar to keep framework and web application committed. The application was in the beginning, a full branch of framework source tree, now I'm updating framework manually at every change on core files. (copying changed files from web app to framework's branch). With this second web application that I'm going to create, I need to know about versions (or revisions) which the application is based. If I found a bug in this version I can fix and then sync files with first web application no worrying: functions will be the same to this application. If I'm going to make changes in core (new behavior, new functions in library or something new in source tree) it must be named as "new version". What's the best way to do this? Because I'm using a Distributed Version Control System (bazaar), I'm not dealing with VERSIONS, but revision numbers that change every time. Please fresh my mind with new ideas.

AppDomain.Unload doesn't release the assembly I loaded up with Reflection

Hi All, I am struggling with an issue while loading an assembly into a temporary AppDomain to read its GetUsedReferences property. Once I do that, I call AppDomain.Unload(tempDomain) and then try to clean up my mess by deleting the files. That fails because the file is locked. I unloaded the temporary domain, though! Any thoughts or suggestions would be greatly appreciated. Here is some of my code:

    // I already have bytes for the .dll and the .pdb from the actual files
    AppDomainSetup domainSetup = new AppDomainSetup();
    domainSetup.ApplicationBase = Environment.CurrentDirectory;
    domainSetup.ShadowCopyFiles = "true";
    domainSetup.CachePath = Environment.CurrentDirectory;
    AppDomain tempAppDomain = AppDomain.CreateDomain("TempAppDomain", AppDomain.CurrentDomain.Evidence, domainSetup);

    // Load up the temp assembly and do stuff
    Assembly projectAssembly = tempAppDomain.Load(assemblyFileBuffer, symbolsFileBuffer);

    // Then I'm trying to clean up
    AppDomain.Unload(tempAppDomain);
    tempAppDomain = null;
    File.Delete(tempAssemblyFile);
    // I even try to force GC
    File.Delete(tempSymbolsFile);

Anyway, the deletes fail because the files are still locked. Shouldn't they be released because I unloaded the temporary AppDomain? Thanks in advance, Dan

Create folder and insert file in Google Drive

- by web_student

I am trying to create a new folder in Drive and upload one (or more) files to that created folder. I use the code below, but the result is that both the folder and the file are placed in the root of my Drive.

    $client->setAccessToken($_SESSION['accessToken']);

    //create folder
    $folder_mime = "application/vnd.google-apps.folder";
    $folder_name = 'New Folder';
    $service = new Google_DriveService($client);
    $folder = new Google_DriveFile();
    $folder->setTitle($folder_name);
    $folder->setMimeType($folder_mime);
    $service->files->insert($folder);

    //upload file
    $file_name = $_FILES["uploadFile"]["name"];
    $file_mime = $_FILES["uploadFile"]["type"];
    $file_path = $_FILES["uploadFile"]["tmp_name"];
    $service = new Google_DriveService($client);
    $file = new Google_DriveFile();
    $file->setParents(array($folder_name));
    $file->setTitle($file_name);
    $file->setDescription('This is a '.$file_mime.' document');
    $file->setMimeType($file_mime);
    $service->files->insert(
        $file,
        array(
            'data' => file_get_contents($file_path)
        )
    );

Moving to an arbitrary position in a file in Python

- by B Rivera

Let's say that I routinely have to work with files with an unknown, but large, number of lines. Each line contains a set of integers (space, comma, semicolon, or some non-numeric character is the delimiter) in the closed interval [0, R], where R can be arbitrarily large. The number of integers on each line can be variable. Often times I get the same number of integers on each line, but occasionally I have lines with unequal sets of numbers. Suppose I want to go to Nth line in the file and retrieve the Kth number on that line (and assume that the inputs N and K are valid --- that is, I am not worried about bad inputs). How do I go about doing this efficiently in Python 3.1.2 for Windows? I do not want to traverse the file line by line. I tried using mmap, but while poking around here on SO, I learned that that's probably not the best solution on a 32-bit build because of the 4GB limit. And in truth, I couldn't really figure out how to simply move N lines away from my current position. If I can at least just "jump" to the Nth line then I can use .split() and grab the Kth integer that way. The nuance here is that I don't just need to grab one line from the file. I will need to grab several lines: they are not necessarily all near each other, the order in which I get them matters, and the order is not always based on some deterministic function. Any ideas? I hope this is enough information. Thanks!
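
One common way to approach this kind of lookup is to scan the file once, remember the byte offset at which every line starts, and afterwards jump straight to any line with `seek`. The sketch below illustrates that idea; it still needs one initial pass over the file, which may or may not be acceptable here. The function names are hypothetical, and it assumes the file is opened in binary mode and contains only non-negative integers separated by non-digit characters, as described above.

```python
import io
import re


def build_line_index(f):
    """Scan a binary file once and return the byte offset of each line start."""
    offsets = []
    pos = 0
    f.seek(0)
    for line in f:
        offsets.append(pos)
        pos += len(line)
    return offsets


def kth_int_on_line(f, offsets, n, k):
    """Return the k-th integer (1-based) on the n-th line (1-based)."""
    f.seek(offsets[n - 1])                  # jump directly to the line
    line = f.readline().decode("ascii")
    numbers = re.findall(r"\d+", line)      # any non-digit acts as delimiter
    return int(numbers[k - 1])


# Small in-memory stand-in for a real file opened with open(path, "rb").
sample = io.BytesIO(b"1 2 3\n10,20\n7;8;9;10\n")
index = build_line_index(sample)
print(kth_int_on_line(sample, index, 2, 2))  # prints 20
```

Once the index exists, lines can be fetched in any order, which fits the requirement that the requested lines are neither adjacent nor predictable. For very large files the offsets could be stored more compactly or saved to disk, but that is outside this sketch.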

Got Hacked. Want to understand how.

- by gaoshan88

Someone has, for the second time, appended a chunk of javascript to a site I help run. This javascript hijacks Google adsense, inserting their own account number, and sticking ads all over. The code is always appended, always in one specific directory (one used by a third party ad program), affects a number of files in a number of directories inside this one ad dir (20 or so) and is inserted at roughly the same overnight time. The adsense account belongs to a Chinese website (located in a town not an hour from where I will be in China next month. Maybe I should go bust heads... kidding, sort of), btw. So, how could they append text to these files? Is it related to the permissions set on the files (ranging from 755 to 644)? To the webserver user (it's on MediaTemple so it should be secure, yes?)? I mean, if you have a file that has permissions set to 777 I still can't just add code to it at will... how might they be doing this?

how to synchronize database table and directory with php

- by twmulloy

Hello, I have a directory with files and a database table with what should be the same files. I would like to be able to synchronize the database table with the directory. What would be the most efficient way to do this? Or would I realistically only be able to do this in a brute-force manner? Here's my approach:

1. Retrieve all of the files in the directory as an array.
2. Retrieve all of the filenames in the database table as an array.
3. Loop through the directory array and use in_array() on the database table array to verify that each filename is in that array; if not, add it to an array of missing filenames. Then run a db query to add a row to the database table for each missing file.
4. Loop through the database table array and use in_array() on the directory array; anything not found in the directory array will just be deleted from the table.

Is there a better way to go about this? Or is there something better for this in PHP than in_array()?
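
The two loops described above amount to two set differences. As a rough illustration, here is a minimal sketch in Python rather than PHP; the function name `plan_sync` is hypothetical. Hash-based sets avoid the repeated linear scan that in_array() performs for every filename.

```python
def plan_sync(files_on_disk, files_in_table):
    """Return (to_insert, to_delete) so that the table matches the directory."""
    on_disk = set(files_on_disk)
    in_table = set(files_in_table)
    to_insert = sorted(on_disk - in_table)   # on disk, missing from the table
    to_delete = sorted(in_table - on_disk)   # in the table, gone from disk
    return to_insert, to_delete


to_insert, to_delete = plan_sync(["a.txt", "b.txt", "c.txt"], ["b.txt", "d.txt"])
print(to_insert)  # ['a.txt', 'c.txt']
print(to_delete)  # ['d.txt']
```

In PHP, array_diff() plays a similar role to the set subtraction used here.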

How can I generate an "unlimited" world?

- by snowlord

I would like to create a game with an endless (in reality an extremely large) world in which the player can move about. Whether or not I will ever get around to implement the game is one matter, but I find the idea interesting and would like some input on how to do it. The point is to have a world where all data is generated randomly on-demand, but in a deterministic way. Currently I focus on a large 2D map from which it should be possible to display any part without knowledge about the surrounding parts. I have implemented a prototype by writing a function that gives a random-looking, but deterministic, integer given the x and y of a pixel on the map (see my recent question about this function). Using this function I populate the map with "random" values, and then I smooth the map using a simple filter based on the surrounding pixels. This makes the map dependent on a few pixels outside its edge, but that's not a big problem. The final result is something that at least looks like a map (especially with a good altitude color map). Given this, one could maybe first generate a coarser map which is used to generate bigger differences in altitude to create mountain ranges and seas. Anyway, that was my idea, but I am sure that there exist ways to do this already and I also believe that given the specification, many of you can come up with better ideas. EDIT: Forgot the link to my question.
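
The prototype described above (a deterministic value per coordinate, followed by a smoothing filter over neighbouring pixels) can be sketched as follows. This is not the author's function, which is not shown here. It substitutes a SHA-256 hash of the coordinates as the deterministic source, and the names `cell_value`, `smoothed`, `render` and the `seed` parameter are hypothetical.

```python
import hashlib


def cell_value(x: int, y: int, seed: int = 0) -> int:
    """Deterministic, random-looking value in [0, 255] for map cell (x, y)."""
    digest = hashlib.sha256(f"{seed}:{x}:{y}".encode("ascii")).digest()
    return digest[0]


def smoothed(x: int, y: int, seed: int = 0) -> float:
    """Average of the 3x3 neighbourhood around (x, y): a simple box filter."""
    total = sum(cell_value(x + dx, y + dy, seed)
                for dx in (-1, 0, 1) for dy in (-1, 0, 1))
    return total / 9.0


def render(x0: int, y0: int, width: int, height: int, seed: int = 0):
    """Build any rectangular part of the map without knowing the rest."""
    return [[smoothed(x, y, seed) for x in range(x0, x0 + width)]
            for y in range(y0, y0 + height)]


tile_a = render(0, 0, 4, 4)
tile_b = render(2, 2, 4, 4)
print(tile_a[2][2] == tile_b[0][0])  # overlapping cells agree: True
```

Because every cell depends only on its own coordinates and a fixed-size neighbourhood, two tiles rendered independently agree wherever they overlap. A cryptographic hash is slower than a purpose-built integer hash, so this choice favours simplicity over speed.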

LINQ Joins - Performance

- by Meiscooldude

I am curious how exactly LINQ (not LINQ to SQL) performs its joins behind the scenes, in relation to how SQL Server performs joins. Before executing a query, SQL Server generates an execution plan. The execution plan is basically an expression tree describing what it believes is the best way to execute the query. Each node provides information on whether to do a Sort, Scan, Select, Join, etc. On a 'Join' node in our execution plan, we can see three possible algorithms: Hash Join, Merge Join, and Nested Loops Join. SQL Server will choose which algorithm to use for each join operation based on the expected number of rows in the inner and outer tables, what type of join we are doing (some algorithms don't support all types of joins), whether we need the data ordered, and probably many other factors.

Join Algorithms:
Nested Loop Join: Best for small inputs; can be optimized with an ordered inner table.
Merge Join: Best for medium to large sorted inputs, or an output that needs to be ordered.
Hash Join: Best for medium to large inputs; can be parallelized to scale linearly.

LINQ Query:

    DataTable firstTable, secondTable;
    ...
    var rows = from firstRow in firstTable.AsEnumerable ()
               join secondRow in secondTable.AsEnumerable ()
               on firstRow.Field<object> (randomObject.Property)
               equals secondRow.Field<object> (randomObject.Property)
               select new {firstRow, secondRow};

SQL Query:

    SELECT * FROM firstTable fT INNER JOIN secondTable sT ON fT.Property = sT.Property

SQL Server might use a Nested Loop Join if it knows there are a small number of rows in each table, a Merge Join if it knows one of the tables has an index, and a Hash Join if it knows there are a lot of rows in either table and neither has an index. Does LINQ choose its algorithm for joins, or does it always use the same one?
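
To make the difference between two of the algorithms named above concrete, here is a minimal sketch of a nested loop join and a hash join for an equality join over in-memory rows. It is written in Python for illustration and makes no claim about how either LINQ or SQL Server implements them; the function names and the `key` parameter are hypothetical.

```python
from collections import defaultdict


def nested_loop_join(outer, inner, key):
    """Compare every outer row with every inner row: O(len(outer) * len(inner))."""
    return [(o, i) for o in outer for i in inner if key(o) == key(i)]


def hash_join(outer, inner, key):
    """Build a hash table on the inner rows, then probe it once per outer row."""
    buckets = defaultdict(list)
    for i in inner:
        buckets[key(i)].append(i)
    return [(o, i) for o in outer for i in buckets.get(key(o), ())]


first = [{"p": 1, "a": "x"}, {"p": 2, "a": "y"}, {"p": 3, "a": "z"}]
second = [{"p": 2, "b": "u"}, {"p": 3, "b": "v"}, {"p": 3, "b": "w"}]
by_p = lambda row: row["p"]
print(len(hash_join(first, second, by_p)))  # prints 3
```

Both functions return the same pairs in the same order. The hash join trades extra memory for the lookup table against far fewer comparisons when both inputs are large.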

JVM GC demote object to eden space?

- by Kevin

I'm guessing this isn't possible...but here goes. My understanding is that eden space is cheaper to collect than old gen space, especially when you start getting into very large heaps. Large heaps tend to come up with long running applications (server apps) and server apps a lot of the time want to use some kind of caches. Caches with some kind of eviction (LRU) tend to defeat some assumptions that GC makes (temporary objects die quickly). So cache evictions end up filling up old gen faster than you'd like and you end up with a more costly old gen collection. Now, it seems like this sort of thing could be avoided if java provided a way to mark a reference as about to die (delete keyword)? The difference between this and c++ is that the use is optional. And calling delete does not actually delete the object, but rather is a hint to the GC that it should demote the object back to Eden space (where it will be more easily collected). I'm guessing this feature doesn't exist, but, why not (is there a reason it's a bad idea)?

Fast way to manually mod a number

- by Nikolai Mushegian

I need to be able to calculate (a^b) % c for very large values of a and b (which individually are pushing the limit and which cause overflow errors when you try to calculate a^b). For small enough numbers, using the identity (a^b) % c = ((a % c)^b) % c works, but if c is too large this doesn't really help. I wrote a loop to do the mod operation manually, one a at a time:

    private static long no_Overflow_Mod(ulong num_base, ulong num_exponent, ulong mod)
    {
        long answer = 1;
        for (int x = 0; x < num_exponent; x++)
        {
            answer = (answer * num_base) % mod;
        }
        return answer;
    }

but this takes a very long time. Is there any simple and fast way to do this operation without actually having to take a to the power of b AND without using time-consuming loops? If all else fails, I can make a bool array to represent a huge data type and figure out how to do this with bitwise operators, but there has to be a better way.
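
The standard technique for this problem is binary (square-and-multiply) exponentiation, which needs about log2(b) loop iterations instead of b. Below is a minimal sketch in Python, where integers cannot overflow; the function name `mod_pow` is hypothetical. With fixed-width types such as ulong, the intermediate products can still overflow once the modulus exceeds 32 bits, so a wider or arbitrary-precision type would be needed in that case.

```python
def mod_pow(base: int, exponent: int, modulus: int) -> int:
    """Compute (base ** exponent) % modulus by repeated squaring."""
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                      # this bit of the exponent is set
            result = (result * base) % modulus
        base = (base * base) % modulus        # square for the next bit
        exponent >>= 1
    return result


print(mod_pow(4, 13, 497))  # prints 445
```

Python's built-in three-argument `pow(a, b, c)` computes the same result, and .NET 4 offers `System.Numerics.BigInteger.ModPow` for the same purpose.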

Ideas Needed for a Base Code System

- by Tegan Snyder

I've developed a PHP web application that is currently in need of a strategic restructuring. Currently when we setup new clients we give them the entire code base on a subdomain of our main domain and create a new table for them in the database. This results in each client having the entire codebase, meaning when we make bug changes, fixes we have to go back and apply them independently across all clients and this is a pain. What I'd like to create is a base code server that holds all the core PHP files. base.domain.com Then all of our clients (client.domain.com) will only need a few files: config.php would have the database connection information. index.php - displays the login box if session non-existant, otherwise it loads baseline code via remote includes to base.domain.com. My question is does my logic seem feasible? How do other people handle similar situations by having a base code? Also.... Is it even possbile to remotely include PHP files from base.domain.com and include them in client.domain.com? Thanks, Tegan

How to fix error - "@interface interfaceName : someEnumeration" gives error "cannot find interface '

- by Paul V

How can I solve "cannot find interface declaration 'someEnumeration', superclass of 'interfaceName'" error? What steps will reproduce the problem? Compiling Wsdl2ObjC Targeting groupwise.wsdl file Fixing non-valid file names of output csource code like ".h" + ".m" and objects inside source files Moving up one of the @interface BEFORE it was used futher in code! What is the expected output? Something working What do you see instead? 33 errors. "Inherited" from only 3 similar Inheritances of a typedef enum object by a class. All errors are typical: typedef enum types_StatusTrackingOptions { types_StatusTrackingOptions_none = 0, types_StatusTrackingOptions_None, types_StatusTrackingOptions_Delivered, types_StatusTrackingOptions_DeliveredAndOpened, types_StatusTrackingOptions_All, } types_StatusTrackingOptions; types_StatusTrackingOptions types_StatusTrackingOptions_enumFromString(NSString *string); NSString * types_StatusTrackingOptions_stringFromEnum(types_StatusTrackingOptions enumValue); @interface types_StatusTracking : types_StatusTrackingOptions { ... and here I'm having error "cannot find interface declaration for 'types_StatusTrackingOptions', superclass of 'types_StatusTracking'". What version of the product are you using? On what operating system? Wsdl2ObjC - rev 168, OS - Mac OS X 10.6.2, iPhone SDK - 3.2, Simulator - v. 3.1.2 - 3.1.3, wsdl - for GroupWise v.8, NDK released 2008-12-23, wsdl and xsd files are attached. P.S. GroupWise.wsdl + .xsd files could be downloaded from http://code.google.com/p/wsdl2objc/issues/detail?id=99

Are bad data issues that common?

- by Water Cooler v2

I've worked for clients that had a large number of distinct, small to mid-sized projects, each interacting with each other via properly defined interfaces to share data, but not reading and writing to the same database. Each had their own separate database, their own cache, their own file servers/system that they had dedicated access to, and so they never caused any problems. One of these clients is a mobile content vendor, so they're lucky in a way that they do not have to face the same problems that everyday business applications do. They can create all those separate compartments where their components happily live in isolation of the others. However, for many business applications, this is not possible. I've worked with a few clients, one of whose applications I am doing the production support for, where there are "bad data issues" on an hourly basis. Yeah, it's that crazy. Some data records from one of the instances (lower than production, of course) would have been run a couple of weeks ago, and caused some other user's data to get corrupted. And then, a data script will have to be written to fix this issue. And I've seen this happening so much with this client that I have to ask. I've seen this happening at a moderate rate with other clients, but this one just seems to be out of order. If you're working with business applications that share a large amount of data by reading and writing to/from the same database, are "bad data issues" that common in your environment?

The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs

- by Matthew Chambers

Hello I am getting the below message on a table i am trying to create The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs Anyone know the answer to this please -- Table warrington_central.job -- ----------------------------------------------------- CREATE TABLE IF NOT EXISTS warrington_central.job ( id MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT , alias_title VARCHAR(255) NOT NULL , reference_number VARCHAR(100) NOT NULL , title VARCHAR(255) NOT NULL , primary_category SMALLINT(5) UNSIGNED NOT NULL , secondary_category SMALLINT(5) UNSIGNED NOT NULL , tertiary_category SMALLINT(5) UNSIGNED NULL , address_id BIGINT(20) UNSIGNED NOT NULL , geolocation_id BIGINT(20) UNSIGNED NULL , company VARCHAR(255) NOT NULL , description VARCHAR(10000) NOT NULL , skills_required VARCHAR(10000) NOT NULL , job_type TINYINT(2) UNSIGNED NOT NULL , experience_months_required TINYINT(2) UNSIGNED NOT NULL , experience_years_required TINYINT(2) UNSIGNED NOT NULL , salary_range VARCHAR(30) NOT NULL , extra_benefits_above_salary VARCHAR(500) NOT NULL , available_from DATE NULL , available_to DATE NULL , extra_location_details VARCHAR(1000) NOT NULL , contact_email VARCHAR(100) NOT NULL , contact_phone_number VARCHAR(20) NOT NULL , contact_mobile_number VARCHAR(20) NOT NULL , terms_conditions_application VARCHAR(5000) NOT NULL , link_to_profile ENUM('0','1') NOT NULL , created_on DATETIME NOT NULL , updated_on DATETIME NOT NULL , updated_by BIGINT(20) UNSIGNED NOT NULL , add_contact_form ENUM('0','1') NOT NULL , admin_package_id TINYINT(1) UNSIGNED NOT NULL , package_start_date DATETIME NOT NULL , package_end_date DATETIME NULL , package_comment VARCHAR(500) NOT NULL , viewable_to_members_only ENUM('0','1') NOT NULL , advertise_to DATETIME NULL , show_comment ENUM('0','1') NOT NULL , hits BIGINT(20) UNSIGNED NOT NULL DEFAULT 0 , visible ENUM('0','1') NOT NULL DEFAULT '0' , approved ENUM('I/* large SQL query (3.9 KB), 
snipped at 2,000 characters / / SQL Error (1118): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. You have to change some columns to TEXT or BLOBs */ SHOW WARNINGS;
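
As a rough worked example of where the 65535-byte limit comes from, the sketch below adds up the worst-case storage of the VARCHAR columns that are visible in the truncated statement. It assumes a 3-bytes-per-character character set (utf8), which the question does not state; with a 1-byte character set the totals would be a third of these. The remaining columns, including any cut off by the truncation, would add to the total.

```python
# (name, declared length) of the VARCHAR columns visible in the statement.
VARCHAR_COLUMNS = [
    ("alias_title", 255), ("reference_number", 100), ("title", 255),
    ("company", 255), ("description", 10000), ("skills_required", 10000),
    ("salary_range", 30), ("extra_benefits_above_salary", 500),
    ("extra_location_details", 1000), ("contact_email", 100),
    ("contact_phone_number", 20), ("contact_mobile_number", 20),
    ("terms_conditions_application", 5000), ("package_comment", 500),
]
BYTES_PER_CHAR = 3   # assumption: utf8; not stated in the question
ROW_LIMIT = 65535


def varchar_bytes(length: int) -> int:
    """Worst-case bytes for one VARCHAR: data plus a 1- or 2-byte length prefix."""
    data = length * BYTES_PER_CHAR
    return data + (1 if data <= 255 else 2)


def total_bytes(columns) -> int:
    return sum(varchar_bytes(length) for _, length in columns)


total = total_bytes(VARCHAR_COLUMNS)
without_big = total_bytes([c for c in VARCHAR_COLUMNS
                           if c[0] not in ("description", "skills_required")])
print(total, total > ROW_LIMIT)              # 84130 True
print(without_big, without_big > ROW_LIMIT)  # 24126 False
```

Under that assumption, the two VARCHAR(10000) columns alone account for about 60000 bytes, which is why the error message suggests moving large columns to TEXT or BLOB.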

Is XSLT worth investing time in and are there any actual alternatives?

- by Keeno

I realize this has been a few other questions on this topic, and people are saying use your language of choice to manipulate the XML etc etc however, not quite fit my question exactly. Firstly, the scope of the project: We want to develop platform independent e-learning, currently, its a bunch of HTML pages but as they grow and develop they become hard to maintain. The idea: Generate up an XML file + Schema, then produce some XSLT files that process the XML into the eLearning modiles. XML to HTML via XSLT. Why: We would like the flexibilty to be able to easy reformat the content (I realize CSS is a viable alternative here) If we decide to alter the pages layout or functionality in anyway, im guessing altering the "shared" XSLT files would be easier than updating the HTML files. So far, we have about 30 modules, with up to 10-30 pages each Depending on some "parameters" we could output drastically different page layouts/structures, above and beyond what CSS can do Now, all this has to be platform independent, and to be able to run "offline" i.e. without a server powering the HTML Negatives I've read so far for XSLT: Overhead? Not exactly sure why...is it the compute power need to convert to HTML? Difficult to learn Better alternatives Now, what I would like to know exactly is: are there actually any viable alternatives for this "offline"? Am I going about it in the correct manner, do you guys have any advice or alternatives. Thanks!

Ways of breaking down SQL transactional/call data into reports -- 'square data'?

- by RizwanK

I've got a large database of call-traffic information (although the question could be answered with any generic data set.) For instance, a row contains : call endpoint server (endpoint_name) call endpoint status (sip_disconnect_reason) call destination (destination) call completed (duration) [duration 0 is completed] call account group (account_group) It's pretty easy to run SQL reports against the data, i.e. select count(*), endpoint_name from calls where duration0 group by endpoint_name select count(*),destination from calls where blah group by destination I've been calling this filtering or breakdown reports (I get the number of calls per carrier, etc.). Add another breakdown, and you've got two breakdowns, a la select count(*), endpoint_name, sip_disconnect_reason from calls where duration=0 group by endpoint_name, sip_disconnect_reason Of course, if you keep adding breakdowns, you end up making super-large reports and slicing your data so thin that you can't extract any trends from it. So my question is this : Is there a name for this sort of method of report writing? (I've heard words like squares, slicing and breakdown reports applied to them) --- I'm looking for a Python/Reporting toolkit that I can use to make these easier to generate for my end users. aside : Are there other ways of representing transactional data that might be useful rather than the above method? Thanks,
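
Reports like these differ only in the list of columns being grouped on, so they can be generated from a single helper. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory table. The helper name `breakdown`, the whitelist, and the sample rows are hypothetical; only the column names come from the description above.

```python
import sqlite3

# Columns that may be used as breakdown dimensions (whitelisted because they
# are interpolated into the SQL text).
ALLOWED = {"endpoint_name", "sip_disconnect_reason", "destination", "account_group"}


def breakdown(conn, dimensions):
    """Count calls grouped by the given list of dimension columns."""
    for name in dimensions:
        if name not in ALLOWED:
            raise ValueError(f"unknown dimension: {name}")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, COUNT(*) FROM calls GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (endpoint_name TEXT, sip_disconnect_reason TEXT, "
             "destination TEXT, account_group TEXT, duration INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?, ?, ?)", [
    ("gw1", "200", "UK", "a", 0), ("gw1", "486", "UK", "a", 5),
    ("gw2", "200", "US", "b", 0), ("gw1", "200", "US", "a", 0),
])
print(breakdown(conn, ["endpoint_name"]))
print(breakdown(conn, ["endpoint_name", "sip_disconnect_reason"]))
```

Each extra dimension multiplies the number of result rows, which is the thinning effect described above.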

Request size limitation when using MultipartHttpServletRequest of Spring 3.0

- by Spiderman

I'd like to know what the size limitation is if I upload a list of files in one client form submission using the HTTP multipart content type. On the server side I am using Spring's MultipartHttpServletRequest to handle the request. My questions:

Should there be a separate file size limitation and total request size limitation, or is file size the only limitation, so that the request can upload hundreds of files as long as they are not too large?

Does the Spring request wrapper read the complete request and store it in Java heap memory, or does it store it in temporary files so that a bigger quota can be used?

Would reading the HttpServletRequest as a stream change the size limitation, compared with having the application server read the complete HTTP request at once?

What is the bottleneck of this process: the Java heap size, the quota of the filesystem on which my web server runs, the maximum BLOB size allowed by the database in which I am going to save the files, or Spring's internal limitations?

Related threads that still don't have an exact answer to this: does-spring-framework-support-streaming-mode-in-mutlipart-requests is-there-a-way-to-get-raw-http-request-stream-from-java-servlet-handler how-to-drop-body-of-a-request-after-checking-headers-in-servlet apache-commons-fileupload-throws-malformedstreamexception

regular expression code

- by Gaia Andreoletti

Dear all, I need to find the matches between two tab-delimited files like these:

File 1:

    ID1 1 65383896 65383896 G C PCNXL3
    ID1 2 56788990 55678900 T A ACT1
    ID1 1 56788990 55678900 T A PRO55

File 2:

    ID2 34 65383896 65383896 G C MET5
    ID2 2 56788990 55678900 T A ACT1
    ID2 2 56788990 55678900 T A HLA

What I would like to do is retrieve the matching lines between the two files. What I would like to match is everything after the gene ID. So far I have written this code, but unfortunately Perl keeps giving me the error "Use of uninitialized value in pattern match (m//)". Could you please help me figure out where I am going wrong? Thank you in advance!

    use strict;
    open (INA, $ARGV[0]) || die "cannot to open gene file";
    open (INB, $ARGV[1]) || die "cannot to open coding_annotated.var files";
    my @sample1 = <INA>;
    my @sample2 = <INB>;
    foreach my $line (@sample1) {
        my @tab = split (/\t/, $line);
        my $chr = $tab[1];
        my $start = $tab[2];
        my $end = $tab[3];
        my $ref = $tab[4];
        my $alt = $tab[5];
        my $name = $tab[6];
        foreach my $item (@sample2){
            my @fields = split (/\t/,$item);
            if ($fields[1]=~ m/$chr(.*)/ && $fields[2]=~ m/$start(.*)/ && $fields[4]=~ m/$ref(.*)/ && $fields[5]=~ m/$alt(.*)/&& $fields[6]=~ m/$name(.*)/){
                print $line,"\n",$item;
            }
        }
    }
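
For comparison, the matching step can be expressed as a dictionary lookup keyed on every column after the first. This is a minimal sketch in Python rather than Perl. It is not a fix for the script above, and it uses exact equality of the columns where the script uses pattern matches. The function names are hypothetical, and splitting on any whitespace is an assumption that tolerates files that are not strictly tab-delimited.

```python
def key_after_id(line: str):
    """Split on any whitespace and drop the first column (the ID)."""
    return tuple(line.split()[1:])


def matching_lines(lines_a, lines_b):
    """Return (line_a, line_b) pairs whose columns after the ID are identical."""
    index = {}
    for line in lines_b:
        if line.strip():                       # skip blank lines
            index.setdefault(key_after_id(line), []).append(line.rstrip("\n"))
    matches = []
    for line in lines_a:
        if not line.strip():
            continue
        for other in index.get(key_after_id(line), []):
            matches.append((line.rstrip("\n"), other))
    return matches


file1 = ["ID1\t1\t65383896\t65383896\tG\tC\tPCNXL3\n",
         "ID1\t2\t56788990\t55678900\tT\tA\tACT1\n",
         "ID1\t1\t56788990\t55678900\tT\tA\tPRO55\n"]
file2 = ["ID2\t34\t65383896\t65383896\tG\tC\tMET5\n",
         "ID2\t2\t56788990\t55678900\tT\tA\tACT1\n",
         "ID2\t2\t56788990\t55678900\tT\tA\tHLA\n"]
for a, b in matching_lines(file1, file2):
    print(a)
    print(b)
```

With the sample rows shown in the question, only the ACT1 lines match. Indexing the second file once avoids re-splitting it for every line of the first file.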

Splitting a set of objects into several subsets of 'similar' objects

- by doublep

Suppose I have a set of objects, S. There is an algorithm f that, given a set S builds certain data structure D on it: f(S) = D. If S is large and/or contains vastly different objects, D becomes large, to the point of being unusable (i.e. not fitting in allotted memory). To overcome this, I split S into several non-intersecting subsets: S = S1 + S2 + ... + Sn and build Di for each subset. Using n structures is less efficient than using one, but at least this way I can fit into memory constraints. Since size of f(S) grows faster than S itself, combined size of Di is much less than size of D. However, it is still desirable to reduce n, i.e. the number of subsets; or reduce the combined size of Di. For this, I need to split S in such a way that each Si contains "similar" objects, because then f will produce a smaller output structure if input objects are "similar enough" to each other. The problems is that while "similarity" of objects in S and size of f(S) do correlate, there is no way to compute the latter other than just evaluating f(S), and f is not quite fast. Algorithm I have currently is to iteratively add each next object from S into one of Si, so that this results in the least possible (at this stage) increase in combined Di size: for x in S: i = such i that size(f(Si + {x})) - size(f(Si)) is min Si = Si + {x} This gives practically useful results, but certainly pretty far from optimum (i.e. the minimal possible combined size). Also, this is slow. To speed up somewhat, I compute size(f(Si + {x})) - size(f(Si)) only for those i where x is "similar enough" to objects already in Si. Is there any standard approach to such kinds of problems? I know of branch and bounds algorithm family, but it cannot be applied here because it would be prohibitively slow. My guess is that it is simply not possible to compute optimal distribution of S into Si in reasonable time. But is there some common iteratively improving algorithm?
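
The greedy loop described above can be sketched as follows. Since f is not specified, the sketch uses a stand-in: objects are strings and size(f(S)) is taken to be the number of distinct characters in S. That stand-in, the `capacity` limit that decides when a new subset is opened, and all function names are assumptions made only so that the example runs.

```python
def structure_size(subset) -> int:
    """Hypothetical stand-in for size(f(S)): distinct characters in the subset."""
    return len(set().union(*subset))


def greedy_partition(objects, capacity):
    """Put each object where it increases the combined size the least."""
    subsets = []
    for x in objects:
        best, best_increase = None, None
        for s in subsets:
            before = structure_size(s)
            after = structure_size(s + [x])
            if after > capacity:              # would not fit in memory
                continue
            increase = after - before
            if best_increase is None or increase < best_increase:
                best, best_increase = s, increase
        if best is None:
            subsets.append([x])               # nothing fits: open a new subset
        else:
            best.append(x)
    return subsets


parts = greedy_partition(["abc", "abd", "xyz", "xyw", "abe"], capacity=5)
print(parts)  # [['abc', 'abd', 'abe'], ['xyz', 'xyw']]
```

As in the original description, the result depends on the order in which the objects arrive, which is one reason the outcome can be far from the optimum.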

Read the article

How can I accelerate the generation of an MD5 checksum within VB.NET?

- by Richard

I'm working with some very large files residing on P2 (Panasonic) cards. Part of the process we employ is to first generate a checksum of the file we are going to copy, then copy the file, then run a checksum on the copy to confirm that it copied OK. The problem is that the files are large (70 GB+) and the process takes a long time to complete. This is an issue since we will eventually be dealing with thousands of these files.

I would like to find a faster way to generate the checksum than using System.Security.Cryptography.MD5CryptoServiceProvider. I don't care if this means using a specialized hardware card, provided it works and is not too ungodly expensive. I would prefer a method of encoding that provides some feedback on how far the process has progressed, so I can display it like I do now.

The application is written in VB.NET. I would prefer to be able to use the solution as a component, library, or reference within my application, but I'm willing to call an outside application if it improves the speed of generating the checksum enough. Needless to say, the checksum must be consistent and correct. :-)

Thank you in advance for your time and efforts, Richard
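One possible way to cut a full pass over the data from the checksum, copy, checksum sequence described above is to hash the source while it is being copied, so the source is read only once. The sketch below illustrates that idea in Python rather than VB.NET. The function names, the 8 MB chunk size, and the `on_progress` callback are assumptions made for this example, and whether it actually helps depends on whether disk I/O or the MD5 computation is the bottleneck.

```python
import hashlib
import os
from typing import Callable, Optional

Progress = Optional[Callable[[int, int], None]]


def copy_with_md5(src: str, dst: str,
                  chunk_size: int = 8 * 1024 * 1024,
                  on_progress: Progress = None) -> str:
    """Copy src to dst and return the MD5 of the bytes read from src."""
    total = os.path.getsize(src)
    done = 0
    digest = hashlib.md5()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)   # hash and copy the same chunk
            fout.write(chunk)
            done += len(chunk)
            if on_progress is not None:
                on_progress(done, total)  # e.g. update a progress bar
    return digest.hexdigest()


def file_md5(path: str,
             chunk_size: int = 8 * 1024 * 1024,
             on_progress: Progress = None) -> str:
    """Return the MD5 of a file, reading it in fixed-size chunks."""
    total = os.path.getsize(path)
    done = 0
    digest = hashlib.md5()
    with open(path, "rb") as fin:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            done += len(chunk)
            if on_progress is not None:
                on_progress(done, total)
    return digest.hexdigest()
```

A copy would then be verified by comparing `copy_with_md5(src, dst)` with `file_md5(dst)`. The destination still has to be read back in full, so this removes one of the three passes, not two.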

Read the article

Complex SQL design, help/advice needed

- by eugeneK

Hi, I have a few questions for the SQL gurus here. Briefly, this is an ads management system where a user can define campaigns for different countries, categories, and languages. I have a few questions in mind, so help me with what you can. I'm using ASP.NET, and I want to cache the whole result set for a given user the first time he asks for statistics, so that I avoid large round-trips to the server. Any help is welcome.

Click here for a diagram with all the details you need for my questions.

1. The main purpose of this application is to show the user how many clicks/impressions there were and how much money he spent on a campaign. What is the easiest way to get this information for him? I will also include filtering by date, date ranges, and a few other parameters in this statistics table.

2. The other issue is what happens when the user edits a campaign. The old campaign will die: if the user sets campaignPPU (pay-per-unit) to $0.01 and updates it to $0.05 the next day, everything will be reset to $0.05.

3. If you could re-design some parts of the table design so that it is more flexible and easier to modify, how would you do it?

Thanks... sorry for such a large job, but it may interest some of the SQL guys here.

Read the article

MySQL query puzzle - finding what WOULD have been the most recent date

- by Hank

I've looked all over and haven't yet found an intelligent way to handle this, though I feel sure one is possible.

One table of historical data has quarterly information:

    CREATE TABLE Quarterly (
        unique_ID INT UNSIGNED NOT NULL,
        date_posted DATE NOT NULL,
        datasource TINYINT UNSIGNED NOT NULL,
        data FLOAT NOT NULL,
        PRIMARY KEY (unique_ID));

Another table of historical data (which is very large) contains daily information:

    CREATE TABLE Daily (
        unique_ID INT UNSIGNED NOT NULL,
        date_posted DATE NOT NULL,
        datasource TINYINT UNSIGNED NOT NULL,
        data FLOAT NOT NULL,
        qtr_ID INT UNSIGNED,
        PRIMARY KEY (unique_ID));

The qtr_ID field is not part of the feed of daily data that populated the database - instead, I need to retroactively populate the qtr_ID field in the Daily table with the Quarterly.unique_ID row ID, using what would have been the most recent quarterly data on that Daily.date_posted for that data source.

For example, if the quarterly data is

    unique_ID  date_posted  datasource  data
    101        2009-03-31   1           4.5
    102        2009-06-30   1           4.4
    103        2009-03-31   2           7.6
    104        2009-06-30   2           7.7
    105        2009-09-30   1           4.7

and the daily data is

    unique_ID  date_posted  datasource  data  qtr_ID
    1001       2009-07-14   1           3.5   ??
    1002       2009-07-15   1           3.4   &&
    1003       2009-07-14   2           2.3   ^^

then we would want the ?? qtr_ID field to be assigned '102' as the most recent quarter for that data source on that date, and && would also be '102', and ^^ would be '104'.

The challenges include that both tables (particularly the daily table) are actually very large, they can't be normalized to get rid of the repetitive dates or otherwise optimized, and for certain daily entries there is no preceding quarterly entry.

I have tried a variety of joins, using datediff (where the challenge is finding the minimum value of datediff greater than zero), and other attempts, but nothing is working for me - usually my syntax is breaking somewhere. Any ideas welcome - I'll execute any basic ideas or concepts and report back.
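One way to express "the most recent quarterly row on or before the daily date, for the same data source" is a correlated subquery inside an UPDATE. The sketch below tries the idea on the sample rows from the question, using Python's built-in sqlite3 module as a stand-in for MySQL, so the column types and date handling differ from the original schema. The row with ID 1004 and the index name are hypothetical additions, the former to show a daily entry with no preceding quarterly entry. Treating a quarterly row posted on the same day as "most recent" (the `<=` comparison) is also an assumption.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Quarterly (
            unique_ID   INTEGER PRIMARY KEY,
            date_posted TEXT    NOT NULL,  -- ISO dates compare correctly as text
            datasource  INTEGER NOT NULL,
            data        REAL    NOT NULL);
        CREATE TABLE Daily (
            unique_ID   INTEGER PRIMARY KEY,
            date_posted TEXT    NOT NULL,
            datasource  INTEGER NOT NULL,
            data        REAL    NOT NULL,
            qtr_ID      INTEGER);
        -- Hypothetical index so each lookup can seek by source and date.
        CREATE INDEX idx_quarterly_source_date
            ON Quarterly (datasource, date_posted);
    """)
    conn.executemany("INSERT INTO Quarterly VALUES (?, ?, ?, ?)", [
        (101, "2009-03-31", 1, 4.5),
        (102, "2009-06-30", 1, 4.4),
        (103, "2009-03-31", 2, 7.6),
        (104, "2009-06-30", 2, 7.7),
        (105, "2009-09-30", 1, 4.7),
    ])
    conn.executemany(
        "INSERT INTO Daily (unique_ID, date_posted, datasource, data) "
        "VALUES (?, ?, ?, ?)", [
            (1001, "2009-07-14", 1, 3.5),
            (1002, "2009-07-15", 1, 3.4),
            (1003, "2009-07-14", 2, 2.3),
            (1004, "2009-01-05", 1, 1.0),  # hypothetical: no earlier quarter
        ])
    conn.commit()
    return conn


def assign_quarter_ids(conn: sqlite3.Connection) -> None:
    """Set Daily.qtr_ID to the latest Quarterly row on or before each date."""
    conn.execute("""
        UPDATE Daily
        SET qtr_ID = (
            SELECT q.unique_ID
            FROM Quarterly AS q
            WHERE q.datasource = Daily.datasource
              AND q.date_posted <= Daily.date_posted
            ORDER BY q.date_posted DESC
            LIMIT 1)
    """)
    conn.commit()


if __name__ == "__main__":
    db = build_demo_db()
    assign_quarter_ids(db)
    for row in db.execute(
            "SELECT unique_ID, qtr_ID FROM Daily ORDER BY unique_ID"):
        print(row)
```

Daily rows with no preceding quarterly entry are left with a NULL qtr_ID, because the subquery returns no row for them. On a very large Daily table, the statement would probably need to be run in batches (for example by ranges of unique_ID), which this sketch does not attempt.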
