Search Results

Search found 18409 results on 737 pages for 'large projects'.


Unable to load huge XML document (incorrectly suppose it's due to the XSLT processing)

- by krisvandenbergh

I'm trying to match certain elements using XSLT. My input document is very large and the source XML fails to load after processing the following code (consider especially the first line). <xsl:template match="XMI/XMI.content/Model_Management.Model/Foundation.Core.Namespace.ownedElement/Model_Management.Package/Foundation.Core.Namespace.ownedElement"> <rdf:RDF> <rdf:Description rdf:about=""> <xsl:for-each select="Foundation.Core.Class"> <xsl:for-each select="Foundation.Core.ModelElement.name"> <owl:Class rdf:ID="@Foundation.Core.ModelElement.name" /> </xsl:for-each> </xsl:for-each> </rdf:Description> </rdf:RDF> </xsl:template> Apparently the XSLT fails to load after "Model_Management.Model". The PHP code is as follows: if ($xml->loadXML($source_xml) == false) { die('Failed to load source XML: ' . $http_file); } It then fails to perform loadXML and immediately dies. I think there are two options now. 1) I should set a maximum execution time. Frankly, I don't know how to do this for the built-in PHP 5 XSLT processor. 2) Think about another way to match. What would be the best way to deal with this? The input document can be found at http://krisvandenbergh.be/uml_pricing.xml Any help would be appreciated! Thanks.

Getting started with massive data

- by Max

I'm a math guy and occasionally do some statistics/machine learning analysis consulting projects on the side. The data I have access to are usually on the smaller side, at most a couple hundred megabytes (and almost always far less), but I want to learn more about handling and analyzing data on the gigabyte/terabyte scale. What do I need to know and what are some good resources to learn from? Hadoop/MapReduce is one obvious start. Is there a particular programming language I should pick up? (I primarily work now in Python, Ruby, R, and occasionally Java, but it seems like C and Clojure are often used for large-scale data analysis?) I'm not really familiar with the whole NoSQL movement, except that it's associated with big data. What's a good place to learn about it, and is there a particular implementation (Cassandra, CouchDB, etc.) I should get familiar with? Where can I learn about applying machine learning algorithms to huge amounts of data? My math background is mostly on the theory side, definitely not on the numerical or approximation side, and I'm guessing most of the standard ML algorithms don't really scale. Any other suggestions on things to learn would be great!

Database structure for ecommerce site

- by imanc

Hey Guys, I have been tasked with designing an ecommerce solution. The aspect that is causing me the most problems is the database. Currently the site consists of 10+ country-based shops, each with its own database (all residing on the same MySQL instance). For the new site I'd rather all these shop databases be merged into one database so that all tables (products, orders, customers etc.) have a shop_id field. From a programming perspective this seems to make the most sense as we won't have to manage data across multiple databases. Currently the entire site generates about 120k orders a year, but it is experiencing fairly heavy growth and we need to design a solution that will scale. In 5 years there may be more than a million orders per year and a database that contains 5 years of order history (archiving may be a solution here). The question is - do we use a single database, or do we keep the database-per-shop structure? I am currently trying to find supporting evidence for either avenue. The company I am designing the solution for prefers the per-shop database structure because they believe it will allow the sites to scale. But my argument is that the shop databases probably won't get so busy over the next few years that they exceed the capacity of a MySQL database and a "no expenses spared" hardware set-up. I am wondering if anyone has any advice either way? Does anyone have experience with websites / ecommerce sites that have tables containing millions of records? I know there is probably not a clear answer here, but at what stage do we have too many records or too large table files to have a fast loading site? Also, if anyone has any advice on sources of information - books, websites, etc. where I can do further research, it would be highly appreciated! Cheers, imanc

Improving File Read Performance (single file, C++, Windows)

- by david

I have large (hundreds of MB or more) files that I need to read blocks from using C++ on Windows. Currently the relevant functions are: errorType LargeFile::read( void* data_out, __int64 start_position, __int64 size_bytes ) const { if( !m_open ) { // return error } else { seekPosition( start_position ); DWORD bytes_read; BOOL result = ReadFile( m_file, data_out, DWORD( size_bytes ), &bytes_read, NULL ); if( size_bytes != bytes_read || result != TRUE ) { // return error } } // return no error } void LargeFile::seekPosition( __int64 position ) const { LARGE_INTEGER target; target.QuadPart = LONGLONG( position ); SetFilePointerEx( m_file, target, NULL, FILE_BEGIN ); } The performance of the above does not seem to be very good. Reads are on 4K blocks of the file. Some reads are coherent, most are not. A couple questions: Is there a good way to profile the reads? What things might improve the performance? For example, would sector-aligning the data be useful? I'm relatively new to file i/o optimization, so suggestions or pointers to articles/tutorials would be helpful.

Read/Write/Find/Replace huge csv file

- by notapipe

I have a huge (4.5 GB) csv file. I need to perform basic cut and paste and replace operations on some columns. The data is pretty well organized; the only problem is that I cannot work with it in Excel because of its size (2000 rows, 550000 columns). Here is part of the data: ID,Affection,Sex,DRB1_1,DRB1_2,SENum,SEStatus,AntiCCP,RFUW,rs3094315,rs12562034,rs3934834,rs9442372,rs3737728 D0024949,0,F,0101,0401,SS,yes,?,?,A_A,A_A,G_G,G_G D0024302,0,F,0101,7,SN,yes,?,?,A_A,G_G,A_G,?_? D0023151,0,F,0101,11,SN,yes,?,?,A_A,G_G,G_G,G_G I need to remove the 4th, 5th, 6th, 7th, 8th and 9th columns; I need to find every _ character from column 10 onwards and replace it with a space ( ) character; I need to replace every ? with zero (0); I need to replace every comma with a tab; I need to remove the first row (the one that has the column names); I need to replace every 0 with 1, every 1 with 2 and every ? with 0 in the 2nd column; I need to replace F with 2, M with 1 and ? with 0 in the 3rd column; so that the resulting file reads: D0024949 1 2 A A A A G G G G D0024302 1 2 A A G G A G 0 0 D0023151 1 2 A A G G G G G G (both input and output should have one line per row, with no extra blank rows). Is there a memory-efficient way of doing that with Java (I need code to do that), or a usable tool for working with data this large so that I can easily apply Excel-like functionality?
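
The rules listed in the question can be applied one row at a time, so memory use is bounded by a single row rather than by the whole file. The sketch below is a minimal illustration in Python rather than the Java the asker requested. The names `convert_row` and `convert` are made up for this example, and it assumes that no field contains an embedded comma or quote, which is true of the sample rows shown.

```python
# Recoding tables taken from the rules in the question.
AFFECTION = {"0": "1", "1": "2", "?": "0"}   # 2nd column
SEX = {"F": "2", "M": "1", "?": "0"}         # 3rd column


def convert_row(line):
    """Convert one comma-separated input row into one tab-separated output row."""
    fields = line.rstrip("\r\n").split(",")
    ident = fields[0]
    affection = AFFECTION.get(fields[1], fields[1])
    sex = SEX.get(fields[2], fields[2])
    # Columns 4 to 9 are dropped; from column 10 onwards "_" becomes a space
    # and "?" becomes "0".
    genotypes = [g.replace("_", " ").replace("?", "0") for g in fields[9:]]
    return "\t".join([ident, affection, sex] + genotypes)


def convert(src, dst):
    """Stream rows from the file object src to dst, skipping the header row."""
    next(src, None)                  # drop the row holding the column names
    for line in src:
        if line.strip():             # ignore blank lines so none reach the output
            dst.write(convert_row(line) + "\n")
```

With about 550000 columns a single row is only a few megabytes, so calling `convert(open(in_path), open(out_path, "w"))` with real paths should never hold more than one row in memory at a time.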

MySQL - What is wrong with this query or my database? Terrible performance.

- by Moss

SELECT * from `employees` a LEFT JOIN (SELECT phone1 p1, count(*) c FROM `employees` GROUP BY phone1) b ON a.phone1 = b.p1; I'm not sure if it is this query in particular that has the problem. I have been getting terrible performance in general with this database. The table in question has 120,000 rows. I have tried this particular query remotely and locally with the MyISAM and InnoDB engines, with different types of joins, and with and without an index on phone1. I can get this to complete in about 4 minutes on a 10,000 row table successfully but performance drops exponentially with larger tables. Remotely it will lose connection to the server and locally it brings my system to its knees and seems to go on forever. This query is only a smaller step I was trying to do when a larger query couldn't complete. Maybe I should explain the whole scenario. I have one big flat ugly table that lists a bunch of people and their contact info and the info of the companies they work for. I'm trying to normalize the database and intelligently determine which phone numbers apply to individual people and which apply to an office location. My reasoning is that if a phone number occurs multiple times and the number of occurrences equals the number of times that the street address it is attached to occurs, then it must be an office number. So the first step is to count each phone number, grouping by phone number. Normally if you just use COUNT()...GROUP BY it will only list the first record it finds in that group, so I figured I have to join the full table to the count table where the phone number matches. This does work, but as I said I can't successfully complete it on any table much larger than 10,000 rows. This seems pathetic and this doesn't seem like a crazy query to do. Is there a better way to achieve what I want, or do I have to break my large table into 12 pieces, or is there something wrong with the table or db?
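
The counting step described above can be tried on a small scale before running it against the full table. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for MySQL, so it shows the shape of the query and says nothing about MySQL's performance. The `name` column, the index name `idx_phone1` and the function `phone_counts` are assumptions made for this illustration.

```python
import sqlite3


def phone_counts(rows):
    """Return (name, phone1, occurrences of phone1) for every employee row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (name TEXT, phone1 TEXT)")
    # An index on the grouping/join column; whether it helps on a given
    # MySQL version and engine would need to be checked with EXPLAIN.
    conn.execute("CREATE INDEX idx_phone1 ON employees (phone1)")
    conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
    query = """
        SELECT a.name, a.phone1, b.c
        FROM employees a
        LEFT JOIN (SELECT phone1 AS p1, COUNT(*) AS c
                   FROM employees
                   GROUP BY phone1) b
          ON a.phone1 = b.p1
        ORDER BY a.name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result
```

Each row comes back with the number of times its phone number occurs in the table, which is the per-row count the question is after.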

“I could use a little help here” or “I can do it myself, thank you” for Cloud Projects

- by BuckWoody

Windows Azure allows you to write code in languages within the .NET stack, and you can also use Java, C++, PHP, NodeJS and others. Code is code - other than keeping things stateless, using a Web or Worker Role in Azure is not all that different from working with an on-premises system. However… Working in a scalable, component-based stateless architecture that can use federated security is not all that common for many developers. Some are used to owning the server, scaling up, and stateful paradigms that have a single security domain. Making the transition whilst trying to create a new software application or even port a previous one can be daunting. Sure, we have absolutely tons of free training, kits, videos, online books and more to learn on your own, but some things like architecture can be pivotal as you move along. So the question is, should you just strike out on your own for a Cloud project, or get Microsoft Consulting Services or another partner to work with you on your first one? I use a few decision points to help guide the projects I assist in. Note: I’m a huge fan of having help that ends up giving you training and leaves you in charge. If you do engage with someone to help you, make sure you keep this clear and take more and more ownership yourself as the project progresses. How much time do you have? Usually the first thing I ask is about the timeline for the project. It doesn’t matter how skilled you are, if you have a short window to get things done it’s better to get help - especially if this is your first cloud project. Having someone that knows the platform well can save you amazing amounts of time. If you have longer, then start with the training in the link above and once you feel confident, jump in. How complex is the project? If there are a lot of moving parts, it’s best to engage a partner. 
The reason is that certain interactions - particularly things like Service Bus or Data Integration - can be quite different than what you may have encountered before. How many people do you have? I have a “pizza rule” about projects I’ve used in my career - if it takes over two pizzas to feed everyone on the project, it’s too big and will fail. That being said, one developer and a one-week deadline does not a good project make, usually. It’s best to have at least one architect (or someone in that role) guiding the project along, and at least two developers to work on a cloud project. That’s a generalization of course, since I’ve seen great software on Azure with one developer writing code all by herself, but for more complex projects, more (to a point) is better. The nice thing about bringing on a partner is that you don’t have to hire them full time - they help you and then they go away. How critical is the project? There’s no shame in using some help. If the platform is new, if the project is large and complex, and if it is critical to the business, you should engage a partner. That’s regardless of Cloud or anything else - get some help. You don’t want to hit your company’s bottom line in a negative way, but you have to innovate and get them a competitive advantage. Do your research, make sure the partner is qualified to help you, and get it done. Don’t let these questions scare you off. There are lots of projects you can implement on Windows and SQL Azure with nothing other than the Software Development Kit (SDK) that you get for free with Windows Azure. And assistance comes in many forms - sometimes just phone support or a friend you can ask, sometimes Microsoft Consulting Services or any of our great partners. You can get help on just the architecture piece or have them show you how to write the code. They’ll get involved as little or as much as you like.

Given a trace of packets, how would you group them into flows?

- by zxcvbnm

I've tried it these ways so far: 1) Make a hash with the source IP/port and destination IP/port as keys. Each position in the hash is a list of packets. The hash is then saved in a file, with each flow separated by some special characters/line. Problem: Not enough memory for large traces. 2) Make a hash with the same key as above, but only keep in memory the file handles. Each packet is then put into the hash[key] that points to the right file. Problems: Too many flows/files (~200k) and it might run out of memory as well. 3) Hash the source IP/port and destination IP/port, then put the info inside a file. The difference between 2 and 3 is that here the files are opened and closed for each operation, so I don't have to worry about running out of memory because I opened too many at the same time. Problems: WAY too slow, same number of files as 2 so also impractical. 4) Make a hash of the source IP/port pairs and then iterate over the whole trace for each flow. Take the packets that are part of that flow and place them into the output file. Problem: Suppose I have a 60 MB trace that has 200k flows. This way, I would process, say, a 60 MB file 200k times. Maybe removing the packets as I iterate would make it not so painful, but so far I'm not sure this would be a good solution. 5) Split them by IP source/destination and then create a single file for each one, separating the flows by special characters. Still too many files (+50k). Right now I'm using Ruby to do it, which might've been a bad idea, I guess. Currently I've filtered the traces with tshark so that they only have relevant info, so I can't really make them any smaller. I thought about loading everything in memory as described in 1) using C#/Java/C++, but I was wondering if there wouldn't be a better approach here, especially since I might also run out of memory later on even with a more efficient language if I have to use larger traces. 
In summary, the problem I'm facing is that I either have too many files or that I run out of memory. I've also tried searching for some tool to filter the info, but I don't think there is one. The ones I've found only return some statistics and wouldn't scan for every flow as I need.
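
One way to avoid both "too many files" and "not enough memory" is to partition in two passes: first hash each flow key into a small, fixed number of bucket files, then group each bucket in memory on its own. The sketch below is only an illustration of that idea in Python, not the Ruby the asker used. The packet layout `(src_ip, src_port, dst_ip, dst_port, payload)`, the function names and the default bucket count are assumptions, and payloads are assumed to contain no tabs or newlines.

```python
import os
import zlib
from collections import defaultdict


def flow_key(pkt):
    """Build the flow key from the source and destination IP/port of a packet."""
    src_ip, src_port, dst_ip, dst_port, _payload = pkt
    return f"{src_ip}:{src_port}>{dst_ip}:{dst_port}"


def group_flows(packets, workdir, num_buckets=16):
    """Yield (flow_key, [payloads]) using at most num_buckets open files."""
    paths = [os.path.join(workdir, f"bucket_{i}.txt") for i in range(num_buckets)]

    # Pass 1: every packet of a given flow lands in the same bucket file.
    handles = [open(p, "w") for p in paths]
    try:
        for pkt in packets:
            key = flow_key(pkt)
            bucket = zlib.crc32(key.encode()) % num_buckets
            handles[bucket].write(f"{key}\t{pkt[4]}\n")
    finally:
        for handle in handles:
            handle.close()

    # Pass 2: only one bucket's flows are held in memory at any time.
    for path in paths:
        flows = defaultdict(list)
        with open(path) as bucket_file:
            for line in bucket_file:
                key, payload = line.rstrip("\n").split("\t", 1)
                flows[key].append(payload)
        for key, payloads in flows.items():
            yield key, payloads
```

The bucket count is the tuning knob: more buckets mean less memory per bucket but more open file handles, and packets within a flow keep the order in which they appeared in the trace.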

OpenSource Projects - Is there a site which lists projects that need more developers?

- by Jamie

Morning/Afternoon/Evening all, Do any of you know of a website which lists open source projects which are in need of more help? Let me elaborate: I would like to work on another open source project (I already work on a couple); however, it would be nice to have a site which lists lots of OS projects, their aims, deadlines, workload, how many more developers they are in need of etc. Of course, I could just pick a topic I'm interested in, find an OS project and then work on it; however, it would be nice to see a diversified list of projects. Primarily because some little known awesome projects get little attention and big projects such as jQuery forks, adium, gimp etc. get a lot of attention because they are well known (and of course because they are great) and thus get a lot of developers working on them. It would be nice to see some little known projects getting more attention and thus hopefully drawing some people to work on them. Currently there are many websites hosting OS projects, such as github, sourceforge, google code etc. A website to centralise all of this into one place and categorise it would be awesome. Let me know your thoughts please. I'm not looking for an answer per se, so I will mark it as a community wiki. Your thoughts would be great.

Windows Azure Service Bus Splitter and Aggregator

- by Alan Smith

This article will cover basic implementations of the Splitter and Aggregator patterns using the Windows Azure Service Bus. The content will be included in the next release of the “Windows Azure Service Bus Developer Guide”, along with some other patterns I am working on. I’ve taken the pattern descriptions from the book “Enterprise Integration Patterns” by Gregor Hohpe. I bought a copy of the book in 2004, and recently dusted it off when I started to look at implementing the patterns on the Windows Azure Service Bus. Gregor also presented a session in 2011, “Enterprise Integration Patterns: Past, Present and Future”, which is well worth a look. I’ll be covering more patterns in the coming weeks; I’m currently working on Wire-Tap and Scatter-Gather. There will no doubt be a section on implementing these patterns in my “SOA, Connectivity and Integration using the Windows Azure Service Bus” course. There are a number of scenarios where a message needs to be divided into a number of sub messages, and also where a number of sub messages need to be combined to form one message. The splitter and aggregator patterns provide a definition of how this can be achieved. This section will focus on the implementation of basic splitter and aggregator patterns using the Windows Azure Service Bus direct programming model. In BizTalk Server, receive pipelines are typically used to implement the splitter patterns, with sequential convoy orchestrations often used to aggregate messages. In the current release of the Service Bus, there is no functionality in the direct programming model that implements these patterns, so it is up to the developer to implement them in the applications that send and receive messages. Splitter A message splitter takes a message and splits the message into a number of sub messages. As there are different scenarios for how a message can be split into sub messages, message splitters are implemented using different algorithms. 
The Enterprise Integration Patterns book describes the Splitter pattern as follows: How can we process a message if it contains multiple elements, each of which may have to be processed in a different way? Use a Splitter to break out the composite message into a series of individual messages, each containing data related to one item. The Enterprise Integration Patterns website provides a description of the Splitter pattern here. In some scenarios a batch message could be split into the sub messages that are contained in the batch. The splitting of a message could be based on the message type of the sub message, or the trading partner that the sub message is to be sent to. Aggregator An aggregator takes a stream of related messages and combines them together to form one message. The Enterprise Integration Patterns book describes the aggregator pattern as follows: How do we combine the results of individual, but related messages so that they can be processed as a whole? Use a stateful filter, an Aggregator, to collect and store individual messages until a complete set of related messages has been received. Then, the Aggregator publishes a single message distilled from the individual messages. The Enterprise Integration Patterns website provides a description of the Aggregator pattern here. A common example of the need for an aggregator is in scenarios where a stream of messages needs to be combined into a daily batch to be sent to a legacy line-of-business application. The BizTalk Server EDI functionality provides support for batching messages in this way using a sequential convoy orchestration. Scenario The scenario for this implementation of the splitter and aggregator patterns is the sending and receiving of large messages using a Service Bus queue. In the current release, the Windows Azure Service Bus supports a maximum message size of 256 KB, with a maximum header size of 64 KB. This leaves a safe maximum body size of 192 KB. 
The BrokeredMessage class will support messages larger than 256 KB; in fact the Size property is of type long, implying that very large messages may be supported at some point in the future. The 256 KB size restriction is set in the service bus components that are deployed in the Windows Azure data centers. One of the ways of working around this size restriction is to split large messages into a sequence of smaller sub messages in the sending application, send them via a queue, and then reassemble them in the receiving application. This scenario will be used to demonstrate the pattern implementations. Implementation The splitter and aggregator will be used to provide functionality to send and receive large messages over the Windows Azure Service Bus. In order to make the implementations generic and reusable they will be implemented as a class library. The splitter will be implemented in the LargeMessageSender class and the aggregator in the LargeMessageReceiver class. A class diagram showing the two classes is shown below. Implementing the Splitter The splitter will take a large brokered message, and split the message into a sequence of smaller sub-messages that can be transmitted over the service bus messaging entities. The LargeMessageSender class provides a Send method that takes a large brokered message as a parameter. The implementation of the class is shown below; console output has been added to provide details of the splitting operation. public class LargeMessageSender { private static int SubMessageBodySize = 192 * 1024; private QueueClient m_QueueClient; public LargeMessageSender(QueueClient queueClient) { m_QueueClient = queueClient; } public void Send(BrokeredMessage message) { // Calculate the number of sub messages required. long messageBodySize = message.Size; int nrSubMessages = (int)(messageBodySize / SubMessageBodySize); if (messageBodySize % SubMessageBodySize != 0) { nrSubMessages++; } // Create a unique session Id. 
string sessionId = Guid.NewGuid().ToString(); Console.WriteLine("Message session Id: " + sessionId); Console.Write("Sending {0} sub-messages", nrSubMessages); Stream bodyStream = message.GetBody<Stream>(); for (int streamOffset = 0; streamOffset < messageBodySize; streamOffset += SubMessageBodySize) { // Get the stream chunk from the large message long arraySize = (messageBodySize - streamOffset) > SubMessageBodySize ? SubMessageBodySize : messageBodySize - streamOffset; byte[] subMessageBytes = new byte[arraySize]; int result = bodyStream.Read(subMessageBytes, 0, (int)arraySize); MemoryStream subMessageStream = new MemoryStream(subMessageBytes); // Create a new message BrokeredMessage subMessage = new BrokeredMessage(subMessageStream, true); subMessage.SessionId = sessionId; // Send the message m_QueueClient.Send(subMessage); Console.Write("."); } Console.WriteLine("Done!"); }} The LargeMessageSender class is initialized with a QueueClient that is created by the sending application. When the large message is sent, the number of sub messages is calculated based on the size of the body of the large message. A unique session Id is created to allow the sub messages to be sent as a message session; this session Id will be used for correlation in the aggregator. A for loop is then used to create the sequence of sub messages by creating chunks of data from the stream of the large message. The sub messages are then sent to the queue using the QueueClient. As sessions are used to correlate the messages, the queue used for message exchange must be created with the RequiresSession property set to true. Implementing the Aggregator The aggregator will receive the sub messages in the message session that was created by the splitter, and combine them to form a single, large message. The aggregator is implemented in the LargeMessageReceiver class, with a Receive method that returns a BrokeredMessage. 
The implementation of the class is shown below; console output has been added to provide details of the aggregation operation. public class LargeMessageReceiver { private QueueClient m_QueueClient; public LargeMessageReceiver(QueueClient queueClient) { m_QueueClient = queueClient; } public BrokeredMessage Receive() { // Create a memory stream to store the large message body. MemoryStream largeMessageStream = new MemoryStream(); // Accept a message session from the queue. MessageSession session = m_QueueClient.AcceptMessageSession(); Console.WriteLine("Message session Id: " + session.SessionId); Console.Write("Receiving sub messages"); while (true) { // Receive a sub message BrokeredMessage subMessage = session.Receive(TimeSpan.FromSeconds(5)); if (subMessage != null) { // Copy the sub message body to the large message stream. Stream subMessageStream = subMessage.GetBody<Stream>(); subMessageStream.CopyTo(largeMessageStream); // Mark the message as complete. subMessage.Complete(); Console.Write("."); } else { // The last message in the sequence is our completeness criteria. Console.WriteLine("Done!"); break; } } // Create an aggregated message from the large message stream. BrokeredMessage largeMessage = new BrokeredMessage(largeMessageStream, true); return largeMessage; } } The LargeMessageReceiver is initialized using a QueueClient that is created by the receiving application. The receive method creates a memory stream that will be used to aggregate the large message body. The AcceptMessageSession method on the QueueClient is then called, which will wait for the first message in a message session to become available on the queue. As the AcceptMessageSession can throw a timeout exception if no message is available on the queue after 60 seconds, a real-world implementation should handle this accordingly. Once the message session has been accepted, the sub messages in the session are received, and their message body streams copied to the memory stream. 
Once all the messages have been received, the memory stream is used to create a large message, which is then returned to the receiving application. Testing the Implementation The splitter and aggregator are tested by creating a message sender and message receiver application. The payload for the large message will be one of the webcast video files from http://www.cloudcasts.net/; the file size is 9,697 KB, well over the 256 KB threshold imposed by the Service Bus. As the splitter and aggregator are implemented in a separate class library, the code used in the sender and receiver console is fairly basic. The implementation of the main method of the sending application is shown below. static void Main(string[] args) { // Create a token provider with the relevant credentials. TokenProvider credentials = TokenProvider.CreateSharedSecretTokenProvider (AccountDetails.Name, AccountDetails.Key); // Create a URI for the service bus. Uri serviceBusUri = ServiceBusEnvironment.CreateServiceUri ("sb", AccountDetails.Namespace, string.Empty); // Create the MessagingFactory MessagingFactory factory = MessagingFactory.Create(serviceBusUri, credentials); // Use the MessagingFactory to create a queue client QueueClient queueClient = factory.CreateQueueClient(AccountDetails.QueueName); // Open the input file. FileStream fileStream = new FileStream(AccountDetails.TestFile, FileMode.Open); // Create a BrokeredMessage for the file. BrokeredMessage largeMessage = new BrokeredMessage(fileStream, true); Console.WriteLine("Sending: " + AccountDetails.TestFile); Console.WriteLine("Message body size: " + largeMessage.Size); Console.WriteLine(); // Send the message with a LargeMessageSender LargeMessageSender sender = new LargeMessageSender(queueClient); sender.Send(largeMessage); // Close the messaging factory. factory.Close(); } The implementation of the main method of the receiving application is shown below. 
static void Main(string[] args) { // Create a token provider with the relevant credentials. TokenProvider credentials = TokenProvider.CreateSharedSecretTokenProvider (AccountDetails.Name, AccountDetails.Key); // Create a URI for the service bus. Uri serviceBusUri = ServiceBusEnvironment.CreateServiceUri ("sb", AccountDetails.Namespace, string.Empty); // Create the MessagingFactory MessagingFactory factory = MessagingFactory.Create(serviceBusUri, credentials); // Use the MessagingFactory to create a queue client QueueClient queueClient = factory.CreateQueueClient(AccountDetails.QueueName); // Create a LargeMessageReceiver and receive the message. LargeMessageReceiver receiver = new LargeMessageReceiver(queueClient); BrokeredMessage largeMessage = receiver.Receive(); Console.WriteLine("Received message"); Console.WriteLine("Message body size: " + largeMessage.Size); string testFile = AccountDetails.TestFile.Replace(@"\In\", @"\Out\"); Console.WriteLine("Saving file: " + testFile); // Save the message body as a file. Stream largeMessageStream = largeMessage.GetBody<Stream>(); largeMessageStream.Seek(0, SeekOrigin.Begin); FileStream fileOut = new FileStream(testFile, FileMode.Create); largeMessageStream.CopyTo(fileOut); fileOut.Close(); Console.WriteLine("Done!"); } In order to test the application, the sending application is executed, which will use the LargeMessageSender class to split the message and place it on the queue. The output of the sender console is shown below. The console shows that the body size of the large message was 9,929,365 bytes, and the message was sent as a sequence of 51 sub messages. When the receiving application is executed the results are shown below. The console application shows that the aggregator has received the 51 messages from the message sequence that was created in the sending application. The messages have been aggregated to form a message with a body of 9,929,365 bytes, which is the same as the original large message. 
The message body is then saved as a file. Improvements to the Implementation The splitter and aggregator patterns in this implementation were created in order to show the usage of the patterns in a demo, which they do quite well. When implementing these patterns in a real-world scenario there are a number of improvements that could be made to the design. Copying Message Header Properties When sending a large message using these classes, it would be great if the message header properties in the message that was received were copied from the message that was sent. The sending application may well add information to the message context that will be required in the receiving application. When the sub messages are created in the splitter, the header properties in the first message could be set to the values in the original large message. The aggregator could then use the values from this first sub message to set the properties in the message header of the large message during the aggregation process. Using Asynchronous Methods The current implementation uses the synchronous send and receive methods of the QueueClient class. It would be much more performant to use the asynchronous methods; however, doing so may well affect the sequence in which the sub messages are enqueued, which would require the implementation of a resequencer in the aggregator to restore the correct message sequence. Handling Exceptions In order to keep the code readable no exception handling was added to the implementations. In a real-world scenario exceptions should be handled accordingly.
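
The chunk arithmetic and the resequencer mentioned under "Using Asynchronous Methods" can be illustrated without the Service Bus at all. The following is a minimal, language-neutral sketch in Python that works on plain byte strings. The dictionary layout and the `sequence` and `total` fields are assumptions made for this illustration; they are not part of the article's classes or of the Service Bus API.

```python
import math
import uuid

SUB_MESSAGE_BODY_SIZE = 192 * 1024   # the safe body size used in the article


def split_message(body, chunk_size=SUB_MESSAGE_BODY_SIZE):
    """Split a bytes body into sub messages that share one session id."""
    session_id = str(uuid.uuid4())
    total = math.ceil(len(body) / chunk_size)
    return [
        {
            "session_id": session_id,
            "sequence": i,            # hypothetical field used for resequencing
            "total": total,           # hypothetical completeness marker
            "body": body[i * chunk_size:(i + 1) * chunk_size],
        }
        for i in range(total)
    ]


def aggregate(sub_messages):
    """Restore the original order and join the bodies back together."""
    if not sub_messages:
        raise ValueError("no sub messages received")
    ordered = sorted(sub_messages, key=lambda m: m["sequence"])
    expected = ordered[0]["total"]
    if [m["sequence"] for m in ordered] != list(range(expected)):
        raise ValueError("sequence is incomplete or contains duplicates")
    return b"".join(m["body"] for m in ordered)
```

For the 9,929,365 byte test file this gives 51 chunks, which matches the count reported by the sender console. Carrying an explicit total also gives the aggregator a completeness check that does not depend on a receive timeout.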

Read the article
Survey: How do you manage the source code for your personal projects?

- by Linchi Shea

This seems to be the survey season. Andy’s post on source controlling T-SQL code triggered a question that I always wanted to ask. Do you version control the source code for your various personal projects (i.e. not projects of your customer or employer)? Do you use a computer at home for your source control repository, or do you use a hosting service such as ProjectLocker? If you do it yourself at home, what version control software do you use? If you use a hosting service, what’s your experience?

Read the article
SVN Authz - Any Subfolder permission or List contents

- by Jaspa Jones

Goal

Basically I would like SVN users to be able to browse through a directory containing a lot of subfolders without allowing them to read its subfolders.

    [/]
    * = r

    [/Projects]
    * = # Allow viewing contents, but not reading. At least to be able to see Project1.

    [/Projects/Project1]
    my_group = rw

Problem

The problem is that there are a lot of projects. I could add every other project and make them disappear for the user, but that would be a lot of work to maintain. It would look like this:

    [/]
    * = r

    [/Projects]
    * = r

    [/Projects/Project1]
    my_group = rw

    [/Projects/Project2]
    * =

    [/Projects/Project3]
    * =

    [/Projects/Project4]
    * =

    [/Projects/Project5]
    * =

It would be nice if I could use this:

    [/Projects/*]
    * =

Any ideas? Thanks in advance, Jaspa Jones

Read the article
What is a good site to use for scheduling 20+ developers and 10 projects? (resource planning) [closed]

- by b-ryce

I have around 20 developers and 10 or so active projects. Then I get asked if my team can take on more work, and who is going to free up when. Currently we are using a spreadsheet to keep track :( I've been digging around for a few hours and haven't found anything that meets my requirements, which are: web based; schedule a developer's time over a period of days/weeks/months; be able to see at a glance which developer has extra capacity; quickly see when the group could take on another large project. I don't mind paying for the software (it does NOT need to be free). Two products which look close are http://www.ganttic.com/tour and http://resourceguruapp.com/ What else are people using? Does anyone have the perfect solution?

Read the article
How does I/O work for large graph databases?

- by tjb1982

I should preface this by saying that I'm mostly a front end web developer, trained as a musician, but over the past few years I've been getting more and more into computer science. So one idea I have as a fun toy project to learn about data structures and C programming was to design and implement my own very simple database that would manage an adjacency list of posts. I don't want SQL (maybe I'll do my own query language? I'm just having fun). It should support ACID. It should be capable of storing 1TB let's say. So with that, I was trying to think of how a database even stores data, without regard to data structures necessarily. I'm working on linux, and I've read that in that world "everything is a file," including hardware (like /dev/*), so I think that that obviously has to apply to a database, too, and it clearly does--whether it's MySQL or PostgreSQL or Neo4j, the database itself is a collection of files you can see in the filesystem. That said, there would come a point in scale where loading the entire database into primary memory just wouldn't work, so it doesn't make sense to design it with that mindset (I assume). However, reading from secondary memory would be much slower and regardless some portion of the database has to be in primary memory in order for you to be able to do anything with it. I read this post: Why use a database instead of just saving your data to disk? And I found it difficult to understand how other databases, like SQLite or Neo4j, read and write from secondary memory and are still very fast (faster, it would seem, than simply writing files to the filesystem as the above question suggests). It seems the key is indexing. But even indexes need to be stored in secondary memory. They are inherently smaller than the database itself, but indexes in a very large database might be prohibitively large, too. 
So my question is how is I/O generally done with large databases like the one I described above that would be at least 1TB storing a big adjacency list? If indexing is more or less the answer, how exactly does indexing work--what data structures should be involved?
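
One way to picture the role of an index here is the sketch below. It is a toy, not how any particular database is implemented: records are appended to a single file with a length prefix, and an in-memory dictionary (standing in for an on-disk index such as the B-trees SQLite uses) maps each key to the byte offset of its record, so a lookup costs one seek and one short read instead of a scan. The record layout and the function names are assumptions made for this illustration.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<I")  # 4-byte little-endian length prefix per record


def append_record(f, payload: bytes) -> int:
    """Append one length-prefixed record; return the byte offset where it starts."""
    f.seek(0, os.SEEK_END)
    offset = f.tell()
    f.write(HEADER.pack(len(payload)))
    f.write(payload)
    return offset


def read_record(f, offset: int) -> bytes:
    """Seek straight to a record and read only that record."""
    f.seek(offset)
    (length,) = HEADER.unpack(f.read(HEADER.size))
    return f.read(length)


def demo() -> bytes:
    index = {}  # post id -> byte offset; a real database keeps this on disk too
    with tempfile.TemporaryFile() as f:
        # Toy adjacency list: each post id maps to its neighbouring post ids.
        for post_id, neighbours in [(1, b"2,3"), (2, b"3"), (3, b"")]:
            index[post_id] = append_record(f, neighbours)
        return read_record(f, index[2])
```

Only the index and the records actually requested need to be in primary memory; the rest of the file stays on disk until something seeks to it.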

Read the article
If you had two projects with the same specification and only one was developed using TDD how could you tell?

- by Andrew

I was asked this question in an interview and it has been bugging me ever since. You have two projects, both with the same specification, but only one of these projects was developed using Test Driven Development. You are given the source for both but with the tests removed from the TDD project. How can you tell which was developed using TDD? All I was able to muster up was something about the classes being more 'broken up' into smaller chunks and having more visible APIs, not my proudest moment. I would be very interested to hear a good answer to this question.

Read the article
How can I create multiple identical AWS EC2 server instances with large amounts of persistent data?

- by mojones

I have a CPU-intensive data-processing application that I want to run across many (~100,000) input files. The application needs a large (~20GB) data file in order to run. What I would like to do is: create an EC2 machine image that has my application and associated data files installed; boot up a large number (e.g. 100) of instances of this image; split my input files up into 100 batches and send one batch to be processed on each instance. I am having trouble figuring out the best way to ensure that each instance has access to the large data file. The data file is too big to fit on the root filesystem of an AMI. I could use Block Storage, but a given Block Storage volume can only be attached to a single instance, so I would need 100 clones. Is there some way to create a custom image that has more space on the root filesystem so that I can include my large data file? Or is there a better way to tackle this problem?
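
The batching step described above (splitting the input files into 100 batches) might be sketched as follows. This covers only the splitting, not launching instances or sharing the data file, and the function name and file names are made up for the example.

```python
from typing import List, Sequence


def split_into_batches(paths: Sequence[str], batch_count: int) -> List[List[str]]:
    """Distribute paths round-robin so that batch sizes differ by at most one."""
    if batch_count <= 0:
        raise ValueError("batch_count must be positive")
    return [list(paths[i::batch_count]) for i in range(batch_count)]


# Hypothetical input names; 100,000 files spread over 100 instances.
input_files = [f"input_{n:06d}.dat" for n in range(100_000)]
batches = split_into_batches(input_files, 100)
```

Round-robin assignment keeps the batches the same size even when the file count is not a multiple of the instance count, although it does not account for files that take longer to process than others.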

Read the article
Breaking up a large PHP object used to abstract the database. Best practices?

- by John Kershaw

Two years ago it was thought a single object with functions such as $database->get_user_from_id($ID) would be a good idea. The functions return objects (not arrays), and the front-end code never worries about the database. This was great, until we started growing the database. There are now 30+ tables, and around 150 functions in the database object. It's getting impractical and unmanageable and I'm going to be breaking it up. What is a good solution to this problem? The project is large, so there's a limit to the extent I can change things. My current plan is to extend the current object for each table, then have the database object contain these. So, the above example would turn into (assume "user" is a table) $database->user->get_user_from_id($ID). Instead of one large file, we would have a file for every table.

Read the article
How well do free-to-open-source-projects policies work in practice?

- by Steve314

In comparison with an open source license and requesting donations, is a free-for-open-source-projects (or free for non-commercial developers) closed source and otherwise commercial project likely to get more license fees? Or just to alienate potential users? Assume the project has value to programmers - I'm looking for generalizations here, though specific examples comparing existing projects will be very interesting. What I have in mind involves code generating programming utilities. And one issue I can think of, either way, is a near total inability to enforce any license restrictions. After all, I can't go around the internet demanding that everyone show me their source code just in case!

Read the article
Best practices for including open source code from other public projects?

- by Bryan Kemp

If I use an existing open source project that is hosted, for example, on GitHub within one of my projects, should I check the code from the other project into my public repo or not? I have mixed feelings about this. I want to give proper credit and attribution to the original developer, and if appropriate I will contribute back any changes I need to make. However, given that I have developed and tested against a specific revision of the other project's code, that is the version that I want to distribute to users of my project. Here is the specific use case to illustrate my point, although I am looking for a more generalized answer than this specific case. I am developing a simple framework using rabbitmq and python for outbound messages that will allow for sending sms, twitter, and email, and is extensible to support additional messaging buses as well. There is a project on GitHub, developed by another person, that handles the creation and sending of SMS messages. When I create my own repo, how do I account for the code that I am including from the other project?

Read the article
Are there compatibility issues opening Visual Studio Professional projects in Visual Studio Express, and vice versa? [migrated]

- by theGreenCabbage

Disclaimer: I have taken a look at the 50+ StackExchange forums to find the right place, and it seems /Programmers/ is the most suitable Exchange for this. If this is the wrong place to ask this, however, please let me know - I will personally delete the thread. I am in the process of downloading a single license for Visual Studio 2013 for my firm of 2-3 developers. One license is approximately $498.00 USD. As a small firm, our funds are short, but since we will be creating commercial software, we decided we will need the features of the Professional edition. At the same time, we have decided to use the Express edition for the rest of the developers. My question is - will there be compatibility issues between Express projects and Professional projects for Visual Studio?

Read the article
Is it better to concentrate on one or two research projects throughout undergrad?

- by AruniRC

Currently in the 4th semester of engineering in an Indian university. The thing is - is it better to do as many short-lived projects/research work on diverse topics of computer science or stick to one/two projects consistently throughout my undergraduate years? Case in point: currently working on an image-processing project that promises to carry on for a year or so (as per the prof). Does this seem like being over-specialized at too early a level? Although taking on too many things will spread me out thin and in all probability not end up getting any meaningful work done. Especially as I hope to apply for grad school in the US. Would really appreciate any views and suggestions on this.

Read the article
Effective way of keeping past projects with their working development environment?

- by Korey Hinton

I find that whenever I want to run a past project, it takes a long time before I can find it and before I have everything set up again for it to be able to run. For example, I have Python projects I created in Linux that depend on software packages that are easily installed in Linux, yet I no longer have the Linux VM I was using. And some of my other projects depend on other variables like web server configuration, PATH variables, SDK, IDE, OS version, device, etc. Does someone have an effective way of handling this issue? As of now I have only concerned myself with keeping the source code backed up, yet it is difficult to re-establish the working development environment and it is also difficult to keep the working development environment around.

Read the article
Places to find free software projects who need developers/project managers?

- by MHarrison

While I have plenty of project management "booksmarts" and a handful of PM experience, I don't seem to have enough experience to get the sort of job I want. Since "I read another PM book/blog today" doesn't really count, I was thinking I could find some free/open source software (FOSS) projects who are looking for/hiring project managers or developers and see if there was anything I could volunteer for. Does anyone know of any FOSS employment sites where I might be able to find such projects? Something similar to careers.stackoverflow.com. I know I could just go to sourceforge/freshmeat and look around, but I was hoping to find some site that fills this need (and if any such sites exist, my google-fu is apparently VERY weak at finding them).

Read the article
How to automatically mount a Windows shared folder on every boot up?

- by Zabba

I am able to access Windows' shared folder from Ubuntu 10.10 Nautilus like so: type into the Location Bar: smb://box/projects

Now I can see the folder in Nautilus and create/read files in it. Also, on the desktop I get a folder called "projects on box". But that folder on the desktop goes away when I reboot. So, I thought that I could automount the Windows shared projects folder by adding this to my fstab:

    //box/Projects /home/base/Projects smbfs rw,user,username=jack,password=www222,fmask=666,dmask=777 0 0

(base is my user name on Ubuntu)

Now I get a folder called "Projects" in my home folder after boot up, but it is empty (I cannot see the same files that I can see in Nautilus). What am I doing wrong?

Some more detail: this is what I see of the Projects folder when I do ls -l in my home folder:

    ...
    drwxr-xr-x 2 root root 4096 2011-01-01 10:22 Projects
    drwxr-xr-x 2 base base 4096 2011-01-01 09:06 Public
    ...

Note the two "roots". Is that somehow the problem?

Read the article
Can I copy large files faster without using the file cache?

- by Veazer

After adding the preload package, my applications seem to speed up but if I copy a large file, the file cache grows by more than double the size of the file. By transferring a single 3-4 GB virtualbox image or video file to an external drive, this huge cache seems to remove all the preloaded applications from memory, leading to increased load times and general performance drops. Is there a way to copy large, multi-gigabyte files without caching them (i.e. bypassing the file cache)? Or a way to whitelist or blacklist specific folders from being cached?
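
One possible approach to the question above, sketched in Python under stated assumptions: copy in chunks and, after each chunk, tell the kernel that the pages just read and written are no longer needed, via os.posix_fadvise with POSIX_FADV_DONTNEED (available on Unix). This is advice that the kernel is free to ignore, so it is a hint rather than a guaranteed cache bypass, and the function name and chunk size are choices made for this example.

```python
import os


def copy_without_caching(src: str, dst: str, chunk_size: int = 8 * 1024 * 1024) -> int:
    """Copy src to dst in chunks, advising the kernel to drop cached pages as we go."""
    can_advise = hasattr(os, "posix_fadvise")  # not available on every platform
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:
                break
            fout.write(chunk)
            if can_advise:
                try:
                    # Written pages can only be dropped once they have reached the disk.
                    fout.flush()
                    os.fsync(fout.fileno())
                    os.posix_fadvise(fout.fileno(), copied, len(chunk), os.POSIX_FADV_DONTNEED)
                    os.posix_fadvise(fin.fileno(), copied, len(chunk), os.POSIX_FADV_DONTNEED)
                except OSError:
                    pass  # the advice is optional; carry on with a plain copy
            copied += len(chunk)
    return copied
```

Calling fsync after every chunk slows the copy down, which is the price of making the written pages eligible for dropping; a larger chunk_size reduces that overhead.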

Read the article
