Search Results

Search found 38088 results on 1524 pages for 'large scale project'.


  • Project Idea with Hadoop MapReduce

    - by Aditya Andhalikar
    Hello, I learned Hadoop a few months back and managed to complete a very introductory programming project with it. I now want to do a small-to-medium-sized project, or a series of small programming assignments, with Hadoop. I have seen lots of ideas around, but I don't see anything that can be finished in about 60-70 hours of work, so I'm after a pretty small-scale project that I can do in my spare time alongside my other studies. Most of the project ideas I have seen are rather large and would run on for 2-3 months. My main objective in this exercise is to develop good expertise in programming within the Hadoop environment, not to do any research or solve a specific problem. I see Hadoop being used a lot with web services; maybe that would be an interesting track for small projects. Thank you in advance. Regards, Aditya

    Read the article

  • gcc/g++: error when compiling large file

    - by Alexander
    Hi, I have an auto-generated C++ source file, around 40 MB in size. It largely consists of push_back commands for some vectors and the string constants that are to be pushed. When I try to compile this file, g++ exits and says that it couldn't reserve enough virtual memory (around 3 GB). Googling this problem, I found that the command line switches --param ggc-min-expand=0 --param ggc-min-heapsize=4096 may solve the problem. They, however, only seem to work when optimization is turned on. 1) Is this really the solution that I am looking for? 2) Or is there a faster, better way (compiling takes ages with these options activated) to do this? Best wishes, Alexander

    Update: Thanks for all the good ideas. I tried most of them. Using an array instead of several push_back() operations reduced memory usage, but as the file that I was trying to compile was so big, it still crashed, only later. In a way, this behaviour is really interesting, as there is not much to optimize in such a setting -- what does GCC do behind the scenes that costs so much memory? (I compiled with all optimizations deactivated as well and got the same results.) The solution that I have switched to now is reading in the original data from a binary object file that I created from the original file using objcopy. This is what I originally did not want to do, because creating the data structures in a higher-level language (in this case Perl) was more convenient than having to do this in C++. However, getting this running under Win32 was more complicated than expected. objcopy seems to generate files in the ELF format, and it seems that some of the problems I had disappeared when I manually set the output format to pe-i386. The symbols in the object file are by default named after the file name, e.g. converting the file inbuilt_training_data.bin would result in these two symbols: binary_inbuilt_training_data_bin_start and binary_inbuilt_training_data_bin_end. I found some tutorials on the web which claim that these symbols should be declared as extern char _binary_inbuilt_training_data_bin_start;, but this does not seem to be right -- only extern char binary_inbuilt_training_data_bin_start; worked for me.
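    A minimal sketch of consuming the objcopy-generated symbols described above (the symbol names follow the inbuilt_training_data.bin example from the question; whether a leading underscore is needed depends on the object format, as noted):

        #include <cstddef>
        #include <iostream>

        // Symbols emitted by objcopy for inbuilt_training_data.bin: their addresses mark the
        // first byte of the embedded data and one past the last byte.
        extern char binary_inbuilt_training_data_bin_start;
        extern char binary_inbuilt_training_data_bin_end;

        int main()
        {
            const char* data = &binary_inbuilt_training_data_bin_start;
            std::size_t size = &binary_inbuilt_training_data_bin_end
                             - &binary_inbuilt_training_data_bin_start;
            std::cout << "embedded blob: " << size << " bytes, first byte = "
                      << static_cast<int>(data[0]) << "\n";
            return 0;
        }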

    Read the article

  • ado.net slow updating large tables

    - by brett
    The problem: 100,000+ name & address records in an Access table (2003). I need to iterate through the table & update details with the output from a 3rd-party dll. I currently use ADO, and it works at an acceptable speed (less than 5 minutes on a network share). We will soon need to move to Access 2007 and its non-Jet .accdb format to maintain compatibility with clients. I've tried using ADO.NET datasets, but updating the records takes hours! We process 5-10 of these tables per day - so this cannot be a solution. Any ideas on the fastest way to update individual records using ADO.NET? Surely we didn't take such a huge backward step with ADO.NET? Any help would be appreciated.

    Read the article

  • How to deploy a project developed in Tapestry5?

    - by shane87
    I have just completed a project as part of a college degree. However, I would like to deploy the project and make it live. I am unsure of how to do this as I have never done it before. I know I need to buy a domain name and some server space to host the project. If anyone can point me in the right direction that would be great. Thanks in advance!

    Read the article

  • Are large include files like iostream efficient? (C++)

    - by Keand64
    <iostream>, once you count all of the files it includes, the files that those include, and so on and so forth, adds up to about 3000 lines. Consider the hello world program, which needs no more functionality than to print something to the screen:

        #include <iostream> // +3000 lines right there.

        int main()
        {
            std::cout << "Hello, World!";
            return 0;
        }

    This should be a very simple piece of code, but <iostream> adds 3000+ lines to a marginal piece of code. So, are those 3000+ lines of code really needed simply to display a single line on the screen, and if not, do they create a less efficient program than if I simply copied the relevant lines into the code?
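    For comparison, a minimal sketch of the same program using the much smaller <cstdio> header (this says nothing about the runtime efficiency of the final executable, only about how much code the compiler has to parse):

        #include <cstdio>   // pulls in far fewer lines than <iostream>

        int main()
        {
            std::puts("Hello, World!");
            return 0;
        }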

    Read the article

  • Good working habits to observe in project development?

    - by Will Marcouiller
    As my development experience grows, I see fit to pick up best practices from here and there and build my own working practices while observing the usual conventions. I'm currently working on a project whose goal is to migrate the security access model from one environment's Active Directory to another environment's automatically. I don't know about the rest of you, but as far as I'm concerned, I have real difficulty sticking to only one approach and then developing. I learn something new every day while visiting SO, and recently wanted to get acquainted with generics. On the other hand, I know the Façade pattern better; it proved very practical for transactional programming in process systems. It seems less practical for a desktop application, as there are plenty of variables to consider in a desktop application that you don't have to care about in transactional programming, where you're playing only with information data. In my current project I have Groups, Organizational Units and Users, which are all considered entries in Active Directory. That makes it a good candidate for generics, the approach also taken by Bart De Smet's Linq to AD on CodePlex. He has a DirectorySource<T>, and to manage, say, groups, he instantiates a source with the proper type: var groups = new DirectorySource<Group>(); This seems a very good way of doing things. Despite that, I go from one pattern to another and can't seem to stick strictly to one. While I'm aware that one shouldn't stay with only one way of doing things, since each pattern offers certain advantages while also showing disadvantages under some usage conditions, I find myself wanting to use both patterns: a singleton Façade class over underlying factories that represent the subsystems (GroupsFactory, UsersFactory, OrganizationalUnitsFactory), where each factory offers the possible operations for its respective entity (group, user, OU). To make a very long story short, I often have plenty of ideas while developing, and this causes me some trouble, as I go from one idea to another and feel completely lost after a while. Although I understand the advantages and disadvantages and have no trouble choosing between patterns depending on the situation, when it comes to the programming itself, if I'm not part of a team, I sometimes feel like I can't do anything good, because I can't stand not doing something "perfect" the first time. The role I play within the project is both project manager and programmer, and I am more comfortable in the project manager, architectural and analytical roles than in the developer's. Do any of you have good habits to observe in project development? Thanks to you all! =)

    Read the article

  • Using Partitions for a large MySQL table

    - by user293594
    An update on my attempts to implement a 505,000,000-row table in MySQL on my MacBook Pro: following the advice given, I have partitioned my table tr (columns i INT UNSIGNED NOT NULL, j INT UNSIGNED NOT NULL, A FLOAT(12,8) NOT NULL, nu BIGINT NOT NULL, KEY (nu), KEY (A)) with a RANGE on nu. nu ought to be a real number, but because I only have 6-d.p. accuracy and the maximum value of nu is 30,000, I multiplied it by 10^8 and made it a BIGINT - I gather one can't use FLOAT or DOUBLE values to PARTITION a MySQL table. Anyway, I have 15 partitions (p0: nu<25,000,000,000, p1: nu<50,000,000,000, etc.). I was thinking that this should speed up a typical SELECT, such as SELECT * FROM tr WHERE nu>95000000000 AND nu<100000000000 AND A>1, to something of the order of the same query on a table consisting of only the data in the relevant partition (<30 secs). But it's taking 30 mins+ to return rows for queries within a partition, and double that if the query is for rows spanning two (contiguous) partitions. I realise I could just have 15 different tables and query them separately, but is there a way to do this 'automatically' with partitions? Has anyone got any suggestions?
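    For reference, a minimal sketch of the partitioned schema as described (the column list is taken from the question; the partition boundaries beyond p0 and p1 are illustrative). With RANGE partitioning on nu, the optimizer should prune to the relevant partition(s) for a simple range condition on nu:

        CREATE TABLE tr (
            i  INT UNSIGNED NOT NULL,
            j  INT UNSIGNED NOT NULL,
            A  FLOAT(12,8)  NOT NULL,
            nu BIGINT       NOT NULL,
            KEY (nu),
            KEY (A)
        )
        PARTITION BY RANGE (nu) (
            PARTITION p0  VALUES LESS THAN (25000000000),
            PARTITION p1  VALUES LESS THAN (50000000000),
            -- ... p2 through p13 in 25,000,000,000 steps, omitted here ...
            PARTITION p14 VALUES LESS THAN MAXVALUE
        );

        -- EXPLAIN PARTITIONS shows which partitions the query actually touches.
        EXPLAIN PARTITIONS
        SELECT * FROM tr WHERE nu > 95000000000 AND nu < 100000000000 AND A > 1;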

    Read the article

  • Reading large excel file with PHP

    - by Itamar Bar-Lev
    I'm trying to read a 17 MB Excel file (2003) with PHPExcel 1.7.3c, but it crashes while loading the file, after exceeding the 120-second limit I have. Is there another library that can do it more efficiently? I have no need for styling; I only need it to support UTF-8. Thanks for your help.
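    If staying with PHPExcel is an option, a rough sketch of its chunked reading via a read filter (assuming read filters are available in 1.7.3c; the file name, chunk size and row limit below are placeholders). Reading data only and loading a few thousand rows per load() call keeps memory bounded, though the total work may still not fit inside a 120-second limit:

        <?php
        require_once 'PHPExcel/IOFactory.php';

        class ChunkReadFilter implements PHPExcel_Reader_IReadFilter
        {
            private $startRow = 0;
            private $endRow   = 0;

            public function setRows($startRow, $chunkSize) {
                $this->startRow = $startRow;
                $this->endRow   = $startRow + $chunkSize;
            }

            public function readCell($column, $row, $worksheetName = '') {
                // Read the heading row plus the rows of the current chunk only.
                return ($row == 1) || ($row >= $this->startRow && $row < $this->endRow);
            }
        }

        $filter = new ChunkReadFilter();
        $reader = PHPExcel_IOFactory::createReader('Excel5');  // Excel 2003 .xls
        $reader->setReadDataOnly(true);                        // skip styling information
        $reader->setReadFilter($filter);

        for ($startRow = 2; $startRow <= 100000; $startRow += 2000) {
            $filter->setRows($startRow, 2000);
            $workbook = $reader->load('big.xls');              // loads only the current chunk
            // ... process the rows of $workbook here ...
            $workbook->disconnectWorksheets();
            unset($workbook);
        }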

    Read the article

  • Exporting Eclipse project with a reference to native library

    - by TacB0sS
    I have an Eclipse project that uses JMF. I found out I could skip the JMF installation process and still use the CaptureDeviceManager of JMF, and receive the list of devices, if I could point my project to the JMF native libraries. I've managed to add the native lib to the IDE run/debug configuration, but once I export the application to an external runnable JAR, the application cannot find the native lib. The files (*.dll) are located in c:\JMF. So far, none of these attempts have worked:
    - adding the folder path to the environment variable in Windows;
    - adding the DLLs into another JAR and adding it to the project;
    - adding the files into the project;
    - adding the path to the class path;
    - adding the path to the library path.
    Does someone have any sort of a solution? Thanks in advance, Adam Zehavi.
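    One thing worth trying (a sketch, not verified against this exact setup): java.library.path only takes effect if it is set when the JVM starts, so launch the exported JAR with it pointing at the DLL folder, e.g.

        java -Djava.library.path="C:\JMF" -jar MyApp.jar

    where MyApp.jar stands in for the exported runnable JAR.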

    Read the article

  • Open source tool for hosting projects similar to "Google Project Hosting"

    - by Jeesmon
    We are looking for an open source tool for hosting our internal projects, similar to Google Project Hosting. The tool should support an individual wiki and version control for each project, and it should be easy to configure per project, as in Google Code. We explored Trac, but it seems to lack good support for multiple projects. The tool will be installed on our internal host; we cannot use a hosted service. A Java-based tool would be ideal.

    Read the article

  • Opening Large (24 GB) File In C

    - by zacaj
    I'm trying to read in a 24 GB XML file in C, but it won't work. I'm printing out the current position using ftell() as I read it in, but once it gets to a big enough number, it goes back to a small number and starts over, never even getting 20% through the file. I assume this is a problem with the range of the variable used to store the position (long), which can go up to about 4,000,000,000 according to http://msdn.microsoft.com/en-us/library/s3f49ktz%28VS.80%29.aspx, while my file is 25,000,000,000 bytes in size. A long long should work, but how would I change what my compiler (Cygwin/mingw32) uses, or get it to have fopen64?
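    A minimal sketch of the 64-bit-offset route (assuming MinGW's fopen64/ftello64; on Cygwin, compiling with _FILE_OFFSET_BITS defined as 64 and using plain fopen/ftello should behave the same way; the file name is a placeholder):

        #include <stdio.h>

        int main(void)
        {
            /* fopen64/ftello64 keep the file position in a 64-bit offset, so it does not
               wrap back to a small number after 4 GB the way a 32-bit long does. */
            FILE *f = fopen64("huge.xml", "rb");
            if (!f) { perror("fopen64"); return 1; }

            char buf[1 << 16];
            size_t n;
            long long pos = 0;
            while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
                /* ... feed the chunk to the XML parser ... */
                pos = ftello64(f);   /* 64-bit position, usable for progress reporting */
            }
            fprintf(stderr, "stopped at byte %.0f\n", (double)pos);

            fclose(f);
            return 0;
        }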

    Read the article

  • How to pick a chunksize for python multiprocessing with large datasets

    - by Sandro
    I am attempting to use Python to gain some performance on a task that can be highly parallelized using http://docs.python.org/library/multiprocessing. When looking at the library documentation, they say to use a chunk size for very long iterables. Now, my iterable is not long, but one of the dicts that it contains is huge: ~100,000 entries, with tuples as keys and numpy arrays for values. How would I set the chunksize to handle this, and how can I transfer this data quickly? Thank you.
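    A minimal sketch of the chunksize idea (names like big_dict and work_item are illustrative): chunksize controls how many items of the iterable are shipped to a worker per task, so it helps when the iterable is long, not when individual items are huge. A common workaround for one huge dict is to hand it to each worker once via an initializer instead of sending it with every task:

        from multiprocessing import Pool
        import numpy as np

        big_dict = {("a", 1): np.zeros(100)}   # illustrative stand-in for the ~100,000-entry dict

        _shared = None

        def init_worker(d):
            # Runs once per worker process; the dict is pickled once per worker, not once per task.
            global _shared
            _shared = d

        def work_item(key):
            # Look up the heavy data inside the worker instead of shipping it with each task.
            return _shared[key].sum()

        if __name__ == "__main__":
            keys = list(big_dict)
            with Pool(processes=4, initializer=init_worker, initargs=(big_dict,)) as pool:
                # chunksize batches keys per task; tune it to amortize IPC overhead.
                results = pool.map(work_item, keys, chunksize=64)
            print(results)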

    Read the article

  • Seamlessly use large background images on webpages

    - by Ben Shelock
    I want to have huge background images on my site, but without giving the user a hard time downloading them and without the site looking ugly as the background loads. They would be no bigger than 1920 x 1080 in pixels; however, it's hard to say how big that is in terms of kilobytes/megabytes. What are my options here, and which are most effective? I'm not too bothered about bandwidth; I just want the user to think everything looks nice ;)
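    One common trick, sketched below without any framework (the image path is a placeholder): show a small background or a plain colour first, preload the full-size image in JavaScript, and only swap it in once it has finished downloading, so the user never watches a huge background paint in piece by piece:

        <script>
        var bg = new Image();
        bg.onload = function () {
            // Apply the huge background only once the browser has the whole file.
            document.body.style.backgroundImage = "url('" + bg.src + "')";
        };
        bg.src = "/images/background-1920x1080.jpg";   // placeholder path
        </script>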

    Read the article

  • Android Development Eclipse - Can't Create a New Android Project - Mac OS

    - by Ben Diamant
    I have an issue creating a new Android project using the Eclipse wizard; everything worked fine until yesterday, and I had a few projects working. Now, when I press "Finish" on the final step of the wizard, it remains open and an empty project with white-marked packages is added to the workbench. I tried to reinstall Eclipse and its SDK + plugin; still nothing. Would really appreciate your assistance. Thank you in advance, Ben

    Read the article

  • Export large amount of data from Oracle 10G to SQL Server 2005

    - by uniball
    Dear all, I need to export 100 million data rows (average row length ~100 bytes) from an Oracle 10G database table into SQL Server 2005 (over a WAN/VLAN with 6 Mbit/s capacity) on a regular basis. So far, these are the options that I have tried, with a quick summary. Has anyone tried this before? Are there other, better options? Which option would be the best in terms of performance and reliability? The times taken were calculated from tests on smaller amounts of data and then extrapolated to estimate the time required.
    - Using the data import wizard on the SQL Server, or SSIS packages, to import the data. It will take around 150 hours to complete the task.
    - Using an Oracle batch job to spool data into a comma-delimited flat file, then using an SSIS package to FTP this file to the SQL Server and load it directly from the flat file. The issue here is the size of the flat file, which is expected to run into GBs.
    - Although this option is drastically different, I am even considering using a Linked Server to query the Oracle data directly at run time to avoid bringing the data over. Performance is a big problem, and I have limited control over the Oracle database in terms of creating table indexes.
    Regards, Uniball
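    A rough sketch of the linked-server variant (ORACLE_SRC, ORCL, and the table/column names are placeholders, and dbo.staging_table is assumed to exist; this also assumes the Oracle OLE DB provider is installed on the SQL Server box). OPENQUERY sends the inner query to Oracle as-is, but for 100 million rows the 6 Mbit/s link will still dominate the transfer time:

        -- One-off setup on the SQL Server side:
        EXEC sp_addlinkedserver @server = N'ORACLE_SRC', @srvproduct = N'Oracle',
                                @provider = N'OraOLEDB.Oracle', @datasrc = N'ORCL';

        -- Pull the rows in one pass; the inner query runs on Oracle.
        INSERT INTO dbo.staging_table (col1, col2, col3)
        SELECT col1, col2, col3
        FROM OPENQUERY(ORACLE_SRC, 'SELECT col1, col2, col3 FROM source_table');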

    Read the article

  • Looking for alternatives to the database project.

    - by Dave
    I have a fairly large database project which contains nine databases and one database with a fairly large schema. This project takes a large amount of time to build, and I'm about to pull my hair out. We'd like to keep our database source-controlled, but we're having a hard time getting the other devs to use the project and build the database project before checking in, just because it takes so long to build. It is seriously crippling our work, so I'm looking for alternatives. Maybe something can be done with Redgate's SQL Compare? I think maybe the only drawback there is that it doesn't validate syntax? Anyone's thoughts/suggestions would be most appreciated.

    Read the article

  • Large Video Uploads via a website

    - by Andrew
    Some of the problems that can happen are timeouts, disconnections, and not being able to resume a file and having to start from the beginning. Assuming these files are up to around 5 GB in size, what is the best solution for dealing with this problem? I'm using a Drupal 6 install for the website. Some of my constraints, due to the server setup I have to deal with:
    - shared hosting with a maximum of 200 connections at a time (unlimited disk space);
    - shared hosting: unable to create users through an API (so I can't automatically generate FTP accounts);
    - I do have the ability to run cron-type scripts via a Drupal module.
    My initial thought was to create FTP users based on Drupal accounts and require them to download an FTP client for their OS of choice. But the lack of an API to auto-create FTP accounts and the inability to do it from the command line kind of hinder that solution. If there's a workaround someone can think of, let me know! Thanks

    Read the article

  • Installing a custom project template with Visual Studio Installer project

    - by ulu
    Hi! I've created a custom project template, and now I need to deploy it together with my product (i.e., it should be installed by the same MSI I use for the main installation). I'm using a Visual Studio Installer project. One option is to use a custom action and manually copy a template file included in the installation. Another is to create a .vsi file and use a custom action to install it after the main installation (how do I have it installed silently?). Which one is better? Thanks a lot, ulu

    Read the article

  • [perl] Efficient processing of large text

    - by jesper
    I have a text file that contains over one million URLs. I have to process this file in order to assign the URLs to groups based on host address:

        {
          'http://www.ex1.com' => ['http://www.ex1.com/...', 'http://www.ex1.com/...', ...],
          'http://www.ex2.com' => ['http://www.ex2.com/...', 'http://www.ex2.com/...', ...]
        }

    My current basic solution takes about 600 MB of RAM to do this (the size of the file is about 300 MB). Could you suggest some more efficient approaches? My current solution simply reads line by line, extracts the host address with a regex and puts the URL into a hash.
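    A minimal sketch of the grouping itself (module-free; the regex is simplistic and the input file name is a placeholder). One way to cut memory is to store only the part of each URL after the host, so the host prefix is not duplicated a million times; appending to one output file per host instead of keeping everything in RAM is another option:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my %by_host;

        open my $in, '<', 'urls.txt' or die "urls.txt: $!";
        while (my $url = <$in>) {
            chomp $url;
            # Capture scheme + host, e.g. "http://www.ex1.com" from "http://www.ex1.com/foo"
            next unless $url =~ m{^(https?://[^/]+)};
            my $host = $1;
            # Keep only the path part to avoid repeating the host prefix in every element.
            push @{ $by_host{$host} }, substr($url, length $host);
        }
        close $in;

        for my $host (sort keys %by_host) {
            printf "%s => %d urls\n", $host, scalar @{ $by_host{$host} };
        }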

    Read the article

  • Fastest way for inserting very large number of records into a Table in SQL

    - by Irchi
    The problem is, we have a huge number of records (more than a million) to be inserted into a single table from a Java application. The records are created by the Java code; it's not a move from another table, so INSERT/SELECT won't help. Currently, my bottleneck is the INSERT statements. I'm using PreparedStatement to speed up the process, but I can't get more than 50 records per second on a normal server. The table is not complicated at all, and there are no indexes defined on it. The process takes too long, and the time it takes will cause problems. What can I do to get the maximum speed (INSERTs per second) possible? Database: MS SQL 2008. Application: Java-based, using the Microsoft JDBC driver.
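    A minimal sketch of JDBC batching (the table, columns, connection URL and credentials are made up). Reusing one PreparedStatement, turning off auto-commit and sending the rows in batches usually pushes throughput far past 50 rows/second; for the absolute maximum, SQL Server's bulk-load paths (BULK INSERT or bcp against a flat file) are worth measuring against this:

        import java.sql.*;

        public class BatchInsert {
            public static void main(String[] args) throws SQLException {
                String url = "jdbc:sqlserver://localhost:1433;databaseName=MyDb"; // placeholder
                try (Connection con = DriverManager.getConnection(url, "user", "password")) {
                    con.setAutoCommit(false);                      // commit per batch, not per row
                    String sql = "INSERT INTO records (name, address) VALUES (?, ?)";
                    try (PreparedStatement ps = con.prepareStatement(sql)) {
                        final int batchSize = 1000;
                        for (int i = 0; i < 1_000_000; i++) {
                            ps.setString(1, "name " + i);          // stand-in for the generated data
                            ps.setString(2, "address " + i);
                            ps.addBatch();
                            if ((i + 1) % batchSize == 0) {
                                ps.executeBatch();                 // one round trip per 1000 rows
                                con.commit();
                            }
                        }
                        ps.executeBatch();                         // flush the tail of the batch
                        con.commit();
                    }
                }
            }
        }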

    Read the article

  • Large strings: Text files or SQL DB?

    - by Tommo
    I am coding a forum system using PHP. I am currently storing a thread's ID, title, author, views and other attributes in an SQL database, and then storing the thread body (the HTML and BBCode) in a text file inside a folder named after the thread ID. In practice it's really simple to grab the database values and then just grab the thread body from the text file, but I was wondering if this is the 'proper way'? I personally have no problem doing this, but if it turns out to be massively inefficient and I should instead store both the thread body HTML and BBCode in the database, then I will change. However, to me it seems wrong to store such a (very possibly) huge string of multi-line text, along with lots of different characters, in a database - I was taught that databases are more for short field 'values' rather than website content. I would just like a definitive answer to this, because it's been bugging me for ages whether I've been doing it properly. Does anyone know how popular forum systems store threads?
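    For what it's worth, a minimal sketch of keeping the body in the database (MySQL syntax; the table and column names are made up). TEXT-family columns are designed for long, multi-line content, and keeping the body next to the thread metadata makes searching, backups and transactions simpler than juggling one file per thread:

        CREATE TABLE threads (
            id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
            title     VARCHAR(255) NOT NULL,
            author_id INT UNSIGNED NOT NULL,
            views     INT UNSIGNED NOT NULL DEFAULT 0,
            body      MEDIUMTEXT   NOT NULL   -- raw BBCode/HTML, up to ~16 MB per row
        ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;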

    Read the article
