Search Results

Search found 11409 results on 457 pages for 'large teams'.


  • Building dictionary of words from large text

    - by LiorH
    I have a text file containing posts in English/Italian. I would like to read the posts into a data matrix so that each row represents a post and each column a word. The cells in the matrix are the counts of how many times each word appears in the post. The dictionary should consist of all the words in the whole file or a non-exhaustive English/Italian dictionary. I know this is a common, essential preprocessing step for NLP. Does anyone know of a tool or project that can perform this task? Someone mentioned Apache Lucene; do you know if a Lucene index can be serialized to a data structure similar to what I need?
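
    As a rough illustration of this preprocessing step, here is a minimal sketch assuming scikit-learn is available; CountVectorizer builds the vocabulary and the post-by-word count matrix in one pass. The file name and the one-post-per-line layout are assumptions:

        from sklearn.feature_extraction.text import CountVectorizer

        # Assumed layout: one post per line in posts.txt (hypothetical file name).
        with open("posts.txt", encoding="utf-8") as f:
            posts = [line.strip() for line in f if line.strip()]

        vectorizer = CountVectorizer()              # default word tokenizer, no stop-word list
        matrix = vectorizer.fit_transform(posts)    # sparse matrix: rows = posts, columns = words
        vocabulary = vectorizer.get_feature_names_out()
        print(matrix.shape, len(vocabulary))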

    Read the article

  • Octave: importing a large matrix in csv format

    - by Massagran
    I'm trying to import a matrix (about 80,000 rows) from a csv file into Octave. The obvious solution seems to be something like: load("-ascii","relative_directory/the_file.csv") or maybe renaming the file and trying: load("-ascii", "relative_directory/the_file.txt") Yet I keep getting the error: load: failed to read matrix from file "relative_directory/the_file.csv" (or .txt) without any more details. Any tips are appreciated.

    Read the article

  • Append a large li to ul: best way?

    - by zsharp
    I have a li that has numerous nested divs. I am appending it to a ul as follows: $("ul#List").append('<li><div>....many more nested divs...</li>'); The structure of the li is the same as the other lis in the ul, but I have to modify some elements. My question is simply: am I doing it wrong by manually writing out the entire structure?

    Read the article

  • A function where small changes in input always result in large changes in output

    - by snowlord
    I would like an algorithm for a function that takes n integers and returns one integer. For small changes in the input, the resulting integer should vary greatly. Even though I've taken a number of courses in math, I have not used that knowledge very much and now I need some help... An important property of this function should be that if it is used with coordinate pairs as input and the result is plotted (as a grayscale value, for example) on an image, any repeating patterns should only be visible if the image is very big. I have experimented with various algorithms for pseudo-random numbers with little success, and finally it struck me that MD5 almost meets my criteria, except that it is not for numbers (at least not from what I know). That resulted in something like this Python prototype (for n = 2; it could easily be changed to take a list of integers, of course):

        import hashlib

        def uniqnum(x, y):
            return int(hashlib.md5(str(x) + ',' + str(y)).hexdigest()[-6:], 16)

    But obviously it feels wrong to go over strings when both input and output are integers. What would be a good replacement for this implementation (in pseudo-code, Python, or whatever language)?
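
    One way to keep the MD5 idea while avoiding the string detour is to hash the packed bytes of the integers directly; a small sketch under the assumption that the inputs fit in signed 64-bit values:

        import hashlib
        import struct

        def uniqnum(*ints):
            # Pack the integers as little-endian signed 64-bit values and hash the raw bytes.
            packed = struct.pack('<%dq' % len(ints), *ints)
            return int(hashlib.md5(packed).hexdigest()[-6:], 16)

        print(uniqnum(3, 5), uniqnum(3, 6))  # nearby inputs, very different outputs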

    Read the article

  • PHP readfile() and large downloads

    - by Nirmal
    I am setting up an online file management system and have hit a block. I am trying to push the file to the client using this modified version of readfile:

        function readfile_chunked($filename, $retbytes = true) {
            $chunksize = 1*(1024*1024); // how many bytes per chunk
            $buffer = '';
            $cnt = 0;
            $handle = fopen($filename, 'rb');
            if ($handle === false) {
                return false;
            }
            while (!feof($handle)) {
                $buffer = fread($handle, $chunksize);
                echo $buffer;
                ob_flush();
                flush();
                if ($retbytes) {
                    $cnt += strlen($buffer);
                }
            }
            $status = fclose($handle);
            if ($retbytes && $status) {
                return $cnt; // return num. bytes delivered like readfile() does.
            }
            return $status;
        }

    But when I try to download a 13 MB file, it just breaks at 4 MB. What would be the issue here? It's definitely not a time limit of any kind, because I am working on a local network and speed is not an issue. The memory limit in PHP is set to 300 MB. Thank you for any help.

    Read the article

  • git-svn on subset of large svn repo

    - by an146
    Repo layout: a/1, a/2, a/3, ..., b/1, b/2, ..., c/1, c/2, ... git-svn works perfectly for me if I work on a single svn repo subdir. But right now I'm facing the need to work on several subdirs (like a/1, a/2, and b/1), and there's a lot of other stuff in the repo besides them. I've managed to write a regexp for this, but git-svn with --ignore-paths seems to check each file's name against this regexp instead of skipping entire folders, so it's too slow. /* Probably I should file a bug report about this */ So -- any ideas on handling this? If some Mercurial svn agent can do selective clones, that's OK too, but I'd rather stick with git. My other idea was some selective svn proxy, but I haven't succeeded in googling anything like that. Thanks!

    Read the article

  • Need to check uptime on a large file being hosted

    - by trustfundbaby
    I have a dynamically generated RSS feed that is about 150 MB in size (don't ask). The problem is that it keeps failing sporadically, and there is no way to monitor it without downloading the entire feed to get a 200 status. Pingdom times out on it and returns a 'down' error. So my question is: how do I check that this thing is up and running?
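
    One way to probe availability without pulling down the whole feed is to request only the first few bytes; a sketch assuming the requests library is installed and that the server answers Range requests sensibly (the URL is a placeholder):

        import requests

        FEED_URL = "https://example.com/feed.rss"  # placeholder

        def feed_is_up(url, timeout=10):
            # Ask for only the first kilobyte; a healthy server answers 200 or 206
            # without the client streaming the full 150 MB body.
            resp = requests.get(url, headers={"Range": "bytes=0-1023"},
                                timeout=timeout, stream=True)
            return resp.status_code in (200, 206)

        print(feed_is_up(FEED_URL))

    If the feed is generated on the fly the server may ignore the Range header and answer with a plain 200; stream=True still keeps the client from reading the body.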

    Read the article

  • Decompressing a very large serialized object and managing memory

    - by Mike_G
    I have an object that contains tons of data used for reports. In order to get this object from the server to the client I first serialize the object into a memory stream, then compress it using the GZip stream of .NET. I then send the compressed object as a byte[] to the client. The problem is that on some clients, when they get the byte[] and try to decompress and deserialize the object, a System.OutOfMemory exception is thrown. I've read that this exception can be caused by new()-ing a bunch of objects, or by holding on to a bunch of strings. Both of these happen during the deserialization process. So my question is: how do I prevent the exception (any good strategies)? The client needs all of the data, and I've trimmed down the number of strings as much as I can. Edit: here is the code I am using to serialize/compress (implemented as extension methods):

        public static byte[] SerializeObject<T>(this object obj, T serializer) where T : XmlObjectSerializer
        {
            Type t = obj.GetType();
            if (!Attribute.IsDefined(t, typeof(DataContractAttribute)))
                return null;
            byte[] initialBytes;
            using (MemoryStream stream = new MemoryStream())
            {
                serializer.WriteObject(stream, obj);
                initialBytes = stream.ToArray();
            }
            return initialBytes;
        }

        public static byte[] CompressObject<T>(this object obj, T serializer) where T : XmlObjectSerializer
        {
            Type t = obj.GetType();
            if (!Attribute.IsDefined(t, typeof(DataContractAttribute)))
                return null;
            byte[] initialBytes = obj.SerializeObject(serializer);
            byte[] compressedBytes;
            using (MemoryStream stream = new MemoryStream(initialBytes))
            {
                using (MemoryStream output = new MemoryStream())
                {
                    using (GZipStream zipper = new GZipStream(output, CompressionMode.Compress))
                    {
                        Pump(stream, zipper);
                    }
                    compressedBytes = output.ToArray();
                }
            }
            return compressedBytes;
        }

        internal static void Pump(Stream input, Stream output)
        {
            byte[] bytes = new byte[4096];
            int n;
            while ((n = input.Read(bytes, 0, bytes.Length)) != 0)
            {
                output.Write(bytes, 0, n);
            }
        }

    And here is my code for decompress/deserialize:

        public static T DeSerializeObject<T, TU>(this byte[] serializedObject, TU deserializer) where TU : XmlObjectSerializer
        {
            using (MemoryStream stream = new MemoryStream(serializedObject))
            {
                return (T)deserializer.ReadObject(stream);
            }
        }

        public static T DecompressObject<T, TU>(this byte[] compressedBytes, TU deserializer) where TU : XmlObjectSerializer
        {
            byte[] decompressedBytes;
            using (MemoryStream stream = new MemoryStream(compressedBytes))
            {
                using (MemoryStream output = new MemoryStream())
                {
                    using (GZipStream zipper = new GZipStream(stream, CompressionMode.Decompress))
                    {
                        ObjectExtensions.Pump(zipper, output);
                    }
                    decompressedBytes = output.ToArray();
                }
            }
            return decompressedBytes.DeSerializeObject<T, TU>(deserializer);
        }

    The object that I am passing is a wrapper object; it just contains all the relevant objects that hold the data. The number of objects can be large (depending on the report's date range), but I've seen as many as 25k strings. One thing I did forget to mention is that I am using WCF, and since the inner objects are passed individually through other WCF calls, I am using the DataContract serializer, and all my objects are marked with the DataContract attribute.

    Read the article

  • How to manage a large dataset using Spring, MySQL and a RowCallbackHandler

    - by rmarimon
    I'm trying to go over each row of a table in MySQL using Spring and a JdbcTemplate. If I'm not mistaken this should be as simple as:

        JdbcTemplate template = new JdbcTemplate(datasource);
        template.setFetchSize(1);
        // template.setFetchSize(Integer.MIN_VALUE) does not work either
        template.query("SELECT * FROM cdr", new RowCallbackHandler() {
            public void processRow(ResultSet rs) throws SQLException {
                System.out.println(rs.getString("src"));
            }
        });

    I get an OutOfMemoryError because it is trying to read the whole thing. Any ideas?

    Read the article

  • Prefilling large volumes of body text in Gmail compose getting a 'Request URI too long' error

    - by Ali
    Hi guys, this is a follow-up to the question: http://stackoverflow.com/questions/2583928/prefilling-gmail-compose-screen-with-html-text where I was building a Google Apps application. I can call a Gmail compose message page from my application using the url: https://mail.google.com/a/domain/?view=cm&fs=1&tf=1&source=mailto&to=WHOEVER%40COMPANY.COM&su=SUBJECTHERE&cc=WHOEVER%40COMPANY.COM&bcc=WHOEVER%40COMPANY.COM&body=PREPOPULATEDBODY However, when I try to pass a very long line of text in the body parameter (e.g. as a reply message body), I get an error from Gmail stating the request URI is too long. Is there a better way to fill in the body text box of the Gmail compose section, or some way to open the page and have it prefilled with JavaScript somehow?

    Read the article

  • How do I make my text wrap in a <div> with a large border radius

    - by Greg Guida
    In the following code:

        <html>
          <body>
            <div style="height:400px; width:400px; -moz-border-radius:100px; -webkit-border-radius:100px;
                        border:3px solid #500; background-color:#a00; overflow:hidden;">
              Why is this getting cut at the beginning???
            </div>
          </body>
        </html>

    why isn't the browser wrapping the text around the rounded corners? In WebKit browsers (I tested both Chrome and Safari) the overflow:hidden cuts off the text outside the border. Firefox just renders the text outside the border. I also tried this without overflow:hidden;, but again the text just rendered outside the border.

    Read the article

  • Inserting Large volume of data in SQL Server 2005

    - by Manjoor
    We have an application (written in C#) to store live stock market prices in the database (SQL Server 2005). It inserts about 1 million records in a single day. Now we are adding some more market segments to it, and the number of records will double (2 million/day). Currently the average record insertion rate is about 50 per second, the maximum is 450 and the minimum is 0. To check certain conditions I have used Service Broker (asynchronous trigger) on my price table. It is running fine at this time (about 35% CPU utilization). Now I am planning to create an in-memory dataset of current stock prices, on which we would like to do some simple calculations. I want to know members' different views on this. Please share your way of dealing with such a situation.

    Read the article

  • One large file or multiple small files?

    - by Dan
    I have an application (currently written in Python as we iron out the specifics, but eventually it will be written in C) that makes use of individual records stored in plain text files. We can't use a database, and new records will need to be manually added regularly. My question is this: would it be faster to have a single file (500k-1Mb) and have my application open it, loop through it to find the data it needs, and close it, OR would it be faster to have the records separated and named using some appropriate convention so that the application could simply loop over filenames to find the data it needs? I know my question is quite general, so pointers to any good articles on the topic are appreciated as much as suggestions. Thanks very much in advance for your time, Dan
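
    Since the current prototype is in Python, one rough way to settle this empirically is to time both layouts on your own data. A sketch under the assumption that records are newline-delimited "id|data" lines in the single-file case and one file per record (named by id) in the other; the paths, ids and record format are placeholders:

        import os
        import time

        def scan_single_file(path, wanted_id):
            # Linear scan of one big file; records assumed to be "id|data" lines.
            with open(path, encoding="utf-8") as f:
                for line in f:
                    rec_id, _, data = line.partition("|")
                    if rec_id == wanted_id:
                        return data
            return None

        def read_record_file(directory, wanted_id):
            # One file per record, named by its id.
            with open(os.path.join(directory, wanted_id + ".txt"), encoding="utf-8") as f:
                return f.read()

        start = time.perf_counter()
        scan_single_file("records.txt", "12345")      # placeholder path and id
        print("single file:", time.perf_counter() - start)

        start = time.perf_counter()
        read_record_file("records", "12345")          # placeholder directory
        print("per-record file:", time.perf_counter() - start)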

    Read the article

  • NPGSQL seems to have a rather large bug?

    - by Mr Shoubs
    Hello, this is a weird one: when I run the following code, all rows are returned from the db. Imagine what would happen if this was an update or delete.

        Dim cmd As New NpgsqlCommand
        cmd.Connection = conn
        cmd.CommandText = "select * FROM ac_profiles WHERE profileid = @profileId"
        cmd.Parameters.Add("@profile", 58)

        Dim dt As DataTable = DataAccess2.DataAccess.sqlQueryDb(cmd)
        DataGridView1.DataSource = dt

    My question is: why is this happening?

    Read the article

  • SQLite/Fluent NHibernate integration test harness initialization not repeatable after large data set

    - by Mark Rogers
    In one of my main data integration test harnesses I create and use Fluent NHibernate's SingleConnectionSessionSourceForSQLiteInMemoryTesting to get a fresh session for each test. After each test, I close the connection, session, and session factory, and throw out the nested StructureMap container they came from. This works for almost any simple data integration test I can think of, including ones that utilize Fluent NHib's PersistenceSpecification object. When I test the application's lengthy database bootstrapping process, which creates and saves thousands of domain objects, I start seeing issues. It's not that the setup and teardown fail; in fact, the test successfully bootstraps the in-memory database just as the application would bootstrap the real database in the production environment. The problem occurs when the database bootstrapping occurs a second time on a new in-memory database, with a new session and session factory. The error is: NHibernate.StaleStateException : Unexpected row count: 0; expected: 1 The row count is indeed unexpected; the row that the application under test is looking for should be in the session. You see, it's not that any data from the last integration test is sticking around; it's that for some reason the session just stops working mid-database-bootstrap. And I've looked everywhere for a place I might be holding on to an old session and I can't find one. I've searched through the code for static singleton objects, but there are none anywhere near the code in question. I have a couple of StructureMap InstanceScope singletons, but they are getting thrown out with each nested container that is lost after every test teardown. I've tried every possible variation on disposing and closing every object involved with each test teardown and it still fails on this lengthy database bootstrap. But non-bootstrap-related database tests appear to work fine. I'm starting to run out of options and may have to surrender lengthy database integration tests in favor of WatiN-based acceptance tests. Can anyone give me any clue about how I can figure out why some of my SingleConnectionSessionSourceForSQLiteInMemoryTesting sessions aren't repeatable? Any advice at all about how to make an NHibernate SQLite database integration test harness repeatable?

    Read the article

  • Memory Issues When DOM Parsing A Large XML File on Android Devices

    - by tonyc
    Hey awesome SO users, I have an Android application that parses an XML file for users and displays the results in a much more mobile-friendly format. The app works great for most users, but some users have lots and lots of data and the app crashes on them because it runs out of memory. Is there any way to have a DOM-style XML parser quit parsing after a certain amount of data? I only need the first 30 or so elements, so it would make the application much more efficient. I'd like to use a SAX or pull parser instead, but the XML I'm parsing is not valid and I have no control over it. Unless anyone has some good SAX solutions that let me parse messy, invalid XML, I think DOM is the only way to go. Thanks for reading!

    Read the article

  • Processing potentially large STDIN data, more than once

    - by d11wtq
    I'd like to provide an accessor on a class that provides an NSInputStream for STDIN, which may be several hundred megabytes (or gigabytes, though unlikely, perhaps) of data. When a caller gets this NSInputStream it should be able to read from it without worrying about exhausting the data it contains. In other words, another block of code may request the NSInputStream and will expect to be able to read from it. Without first copying all of the data into an NSData object, which (I assume) would cause memory exhaustion, what are my options for handling this? The returned NSInputStream does not have to be the same instance; it simply needs to provide the same data. The best I can come up with right now is to copy STDIN to a temporary file and then return NSInputStream instances using that file. Is this pretty much the only way to handle it? Is there anything I should be cautious of if I go the temporary file route?
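
    For what it's worth, here is the temporary-file route sketched in language-agnostic terms (Python, purely as an illustration of the pattern; the Objective-C version would hand out NSInputStream instances backed by the same file):

        import shutil
        import sys
        import tempfile

        # Spool stdin to a temporary file once; every later "reader" opens its own
        # handle on that file, so the data can be consumed any number of times.
        spool = tempfile.NamedTemporaryFile(delete=False)
        shutil.copyfileobj(sys.stdin.buffer, spool)
        spool.close()

        def open_input_stream():
            return open(spool.name, "rb")   # analogous to a fresh NSInputStream per caller

        with open_input_stream() as first, open_input_stream() as second:
            print(len(first.read()), len(second.read()))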

    Read the article

  • DataGridView lags for a second with large data updates

    - by alexD
    I have a DataGridView with about 400 rows and 10 columns. When the user first displays this table, it receives all of the data from the server and populates the table. The DGV uses a DataTable as its data source, and when updating the DataTable I use row.BeginEdit/EndEdit and AcceptChanges, but when the view itself is updated it lags for a second while all of the DGV is being updated. I am wondering if there is a way to make this smooth, so that, for example, if the user is scrolling through the data and it updates, it won't interrupt the scrolling. Or if the user is moving the display around the screen and it updates, it won't interrupt. Is there an easy way to do this? If not, is there a way to prevent the DGV from updating the view until all events have ended, so it won't be repainted until the user stops scrolling, dragging, etc.?

    Read the article

  • Typesetting a large matrix in LaTeX

    - by Hooked
    I have a 3x12 matrix I'd like to input into my LaTeX (with amsmath) document, but LaTeX seems to choke when the matrix gets larger than 3x10:

        \begin{equation}
        \textbf{e} =
        \begin{bmatrix}
        1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & -1 & -1 & -1 & -1 \\
        1 & -1 & 0 & 0 & 1 & 1 & -1 & -1 & 0 & 0 & 1 & -1 \\
        0 & 0 & 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 & 0 & 0
        \end{bmatrix}
        \end{equation}

    The error: Extra alignment tab has been changed to \cr. tells me that I have more & than the bmatrix environment can handle. Is there a proper way to handle this? Also, it seems that the alignment of the 1's and the -1's is different; is that also expected of bmatrix?
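
    For context, amsmath limits its matrix environments to 10 columns by default via the MaxMatrixCols counter; a minimal sketch of raising that limit in the preamble (the value 12 simply matches this matrix):

        \documentclass{article}
        \usepackage{amsmath}
        \setcounter{MaxMatrixCols}{12} % default is 10; raise it for wider matrices

        \begin{document}
        \begin{equation}
        \textbf{e} =
        \begin{bmatrix}
        1 &  1 & 1 &  1 & 0 &  0 &  0 &  0 & -1 & -1 & -1 & -1 \\
        1 & -1 & 0 &  0 & 1 &  1 & -1 & -1 &  0 &  0 &  1 & -1 \\
        0 &  0 & 1 & -1 & 1 & -1 &  1 & -1 &  1 & -1 &  0 &  0
        \end{bmatrix}
        \end{equation}
        \end{document}

    As for the alignment, bmatrix centers each cell, so a -1 (wider because of the minus sign) will not line its digit up with a bare 1 in another row; that is expected behavior for bmatrix.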

    Read the article

  • Deliver large volume of automatic notification emails without being throttled

    - by jack
    I think most websites have a need to deliver emails to their users, e.g. account activation emails, private message notifications, comment notifications, etc. Take my site as an example: among 5,000 registered users, about 1,500 signed up using a gmail.com box, 1,000 using yahoo.com and another 1,000 using hotmail.com. Every now and then I receive complaints from users that they never receive the account activation email; sometimes it goes to the junk folder, sometimes it just does not show up in any folder. Maybe it's kind of being "throttled" when the maximum number of messages sent from the same IP address to gmail.com/yahoo.com/hotmail.com during a certain period of time is exceeded? I'm using Postfix and there seems to be no problem with the configuration, since 90% of emails can be delivered to gmail.com/yahoo.com/hotmail.com boxes successfully. I noticed Twitter is delivering millions of such automatic notifications to its users, but I have never missed a message from them. How do they achieve this? Is there a permanent whitelist on gmail.com, yahoo.com or hotmail.com? Thanks in advance.

    Read the article

  • Linux core dumps are too large!

    - by themoondothshine
    Hey guys, recently I've been noticing an increase in the size of the core dumps generated by my application. Initially they were just around 5MB in size and contained around 5 stack frames, and now I have core dumps of 2GB and the information contained within them is no different from the smaller dumps. Is there any way I can control the size of the core dumps generated? Shouldn't they be at least smaller than the application binary itself? Binaries are compiled in this way: compiled in release mode with debug symbols (i.e., the -g compiler option in GCC); debug symbols are copied into a separate file and stripped from the binary; a GNU debug symbols link is added to the binary. At the beginning of the application, there's a call to setrlimit which sets the core limit to infinity -- is this the problem?
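
    As a point of reference, that setrlimit call maps onto RLIMIT_CORE; a small sketch (using Python's resource module, here only to illustrate the mechanism) of capping the core size instead of setting it to infinity -- the 100 MB figure is arbitrary:

        import resource

        cap = 100 * 1024 * 1024  # arbitrary 100 MB ceiling, purely for illustration
        soft, hard = resource.getrlimit(resource.RLIMIT_CORE)

        # The soft limit may not exceed the hard limit, so clamp it when the hard
        # limit is finite.
        new_soft = cap if hard == resource.RLIM_INFINITY else min(cap, hard)
        resource.setrlimit(resource.RLIMIT_CORE, (new_soft, hard))

        print(resource.getrlimit(resource.RLIMIT_CORE))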

    Read the article

  • Quick backup system for large projects

    - by kamziro
    I've always backed up all my source code into .zip files, put them on my USB drive and uploaded them to my server somewhere else in the world. However, I only do this once every two weeks, because my project is a little big. Right now my project directories (I have a few of them) contain a hierarchy of C++ files, and interspersed with them are .o files, which would make backing up take a while if not ignored. What tools exist out there that will let me back things up efficiently and conveniently, let me specify which file types to back up (lots of .png, .jpg and some text types in there), and which directories to ignore (esp. the build dirs)? Or are there any ingenious methods out there that people use?
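
    In the same spirit as the .zip approach, here is a minimal sketch of a selective archiver built on the standard library; the excluded directory names, the extensions and the paths are assumptions to adapt:

        import os
        import zipfile

        EXCLUDE_DIRS = {"build", ".git"}    # directories to skip (assumed names)
        EXCLUDE_EXTS = {".o", ".obj"}       # intermediate files to skip

        def backup(src_root, archive_path):
            with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
                for dirpath, dirnames, filenames in os.walk(src_root):
                    # Prune excluded directories in place so os.walk never descends into them.
                    dirnames[:] = [d for d in dirnames if d not in EXCLUDE_DIRS]
                    for name in filenames:
                        if os.path.splitext(name)[1] not in EXCLUDE_EXTS:
                            full = os.path.join(dirpath, name)
                            zf.write(full, os.path.relpath(full, src_root))

        backup("my_project", "my_project_backup.zip")   # placeholder paths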

    Read the article

  • Change Large Number of Record Keys using Map Table

    - by Coyote
    I have a set of records indexed by id numbers, and I need to convert these records' indexes to new id numbers. I have a two-column table mapping the old numbers to the new numbers. For example, given these two tables, what would the update statement look like?

    Given:

        OLD_TO_NEW
        oldid | newid
        -------------
        1234  | 0987
        7698  | 5645
        ...   | ...

    and

        id    | data
        -------------
        1234  | 'yo'
        7698  | 'hey'
        ...   | ...

    Need:

        id    | data
        -------------
        0987  | 'yo'
        5645  | 'hey'
        ...   | ...

    This is Oracle, so I have access to PL/SQL; I'm just trying to avoid it.

    Read the article

  • SQL Server: One large persisted computed column for Fulltext Indexing

    - by Alex
    It appears to me to be the easiest, most straightforward solution, but please correct me if I'm wrong. Instead of having a fulltext index on all individual columns of a table, isn't it better to just generate one single wide computed column and run the fulltext index against that only? It appears to me that this gets rid of all the issues of having multiple columns, including that I can't search "x AND y", as this will not match a row with "x" present in column 1 and "y" present in column 2. Any counterarguments?

    Read the article
