Search Results

Search found 81583 results on 3264 pages for 'open data'.

Page 17/3264 | < Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >

  • When to open source a project under development? [closed]

    - by QuasarDonkey
    Possible Duplicate: Is it OK to push my code to GitHub while it is still in early development? I've been working on a hobby project for a few months now; it's clocking in at over 15000 source lines of code. A number of people have expressed interest in joining development, and I have full intentions of going open source, since it would not be feasible for me to complete the project alone. I'm just not sure when to open-source it. For context, I've notice many successful open source projects, such as the Linux kernel, had considerable work done before they were open-sourced. In my case, I'd been planning on open-sourcing it after I complete all the underlying libraries and overall architecture. Is this a mistake; should I just release it right now? I'm worried that since certain critical underlying components haven't been finalized, if people build a large codebase around them, it will be very difficult to change or fix things later. On the other hand, it's a very large project that will require multiple developers to complete in a reasonable time. So when is the right time during development to go open source? Preferably, I'd like to hear from some folks who have started their own projects.

    Read the article

  • Clever ways of implementing different data structures in C & data structures that should be used mor

    - by Yktula
    What are some clever (not ordinary) ways of implementing data structures in C, and what are some data structures that should be used more often? For example, what is the most effective way (generating minimal overhead) to implement a directed and cyclic graph with weighted edges in C? I know that we can store the distances in an array as is done here, but what other ways are there to implement this kind of a graph?

    Read the article

  • Using GameKit to transfer CoreData data between iPhones, via NSDictionary

    - by OscarTheGrouch
    I have an application where I would like to exchange information, managed via Core Data, between two iPhones. First turning the Core Data object to an NSDictionary (something very simple that gets turned into NSData to be transferred). My CoreData has 3 string attributes, 2 image attributes that are transformables. I have looked through the NSDictionary API but have not had any luck with it, creating or adding the CoreData information to it. Any help or sample code regarding this would be greatly appreciated.

    Read the article

  • core-data relationships and data structure.

    - by Boaz
    What is the right way to build iPhone core data for this SMS like app (with location)? - I want to represent an entity of conversation with "profile1" "profile2" that heritage from a profile entity, and a message entity with: "to" "from" "body" where the "to" and "from" are equal to "profile1" and/or "profile2" in the conversation entity. How can I make such a relationships? is there a better way to represent the data (other structure)? Thanks

    Read the article

  • [Visual C++]Forcing memory alignment of variables/data-structures

    - by John
    I'm looking at using SSE and I gather aligning data on 16byte boundaries is recommended. There are two cases to consider: float data[4]; struct myystruct { float x,y,z,w; }; I'm not sure the first case can be done explicitly, though there's perhaps a compiler option I could use? In the second case I remember being able to control packing in old versions of GCC several years back, is this still possible?

    Read the article

  • SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

    - by pinaldave
    This will drop all the clean buffers so we will be able to start again from there. Now, run the following script and check the execution plan of the query. Have you ever wondered what types of data are there in your cache? During SQL Server Trainings, I am usually asked if there is any way one can know how much data in a table is stored in the memory cache? The more detailed question I usually get is if there are multiple indexes on table (and used in a query), were the data of the single table stored multiple times in the memory cache or only for a single time? Here is a query you can run to figure out what kind of data is stored in the cache. USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Now let us run the query above and observe the output of the same. We can see in the above query that there are four columns. Cached_Pages_Count lists the pages cached in the memory. BaseTableName lists the original base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index. Now, let us do one more experience here. Please note that you should not run this test on a production server as it can extremely reduce the performance of the database. DBCC DROPCLEANBUFFERS This will drop all the clean buffers and we will be able to start again from there. Now run following script and check the execution plan for the same. USE AdventureWorks GO SELECT UnitPrice, ModifiedDate FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 1 AND 100 GO The execution plans contain the usage of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. Let me know what you think of this article. I had a great pleasure while writing this article because I was able to write on this subject, which I like the most. In the next article, we will exactly see what data are cached and those that are not cached, using a few undocumented commands. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

    Read the article

  • Call for Abstracts Now Open for Microsoft ASP.NET Connections (Closing April 26)

    - by plitwin
    We are putting out a call for abstracts to present at the Fall 2010 Microsoft ASP.NET Connections conference in Las Vegas, Nov 9-13 2009. The due date for submissions is April 26, 2010. For submitting sessions, please use this URL: http://www.deeptraining.com/devconnections/abstracts Please keep the abstracts under 200 words each and in one paragraph. No bulleted items and line breaks, and please use a spell-checker. Do not email abstracts, you need to use the web-based tool to submit them. Please submit at least 3 abstracts, but it would help your chances of being selected if you submitted 5 or more abstracts. Also, you are encouraged to suggest all-day pre or post conference workshops as well. We need to finalize the conference content and the tracks layout in just a few short weeks, so we need your abstracts by April 26th. No exceptions will be granted on late submissions! Topics of interest include (but are not limited to):* ASP.NET Webforms* ASP.NET AJAX* ASP.NET MVC* Dynamic Data* Anything else related to ASP.NET For Fall 2010, we are having a seperate Silverlight conference where you can submit abstracts for Silverlight and Windows 7 Phone Development. In fact, you can use the same URL to submit sessions to Microsoft ASP.NET Connections, Silverlight Connections, Visual Studio Connections, or SQL Server Connections. The URL again is:http://www.deeptraining.com/devconnections/abstracts Please realize that while we want a lot of the new and the cool, it's also okay to propose sessions on the more mundane "real world" stuff as it pertains to ASP.NET. What you will get if selected:* $500 per regular conference talk.* Compensation for full-day workshops ranges from $500 for 1-20 attendees to $2500 for 200+ attendees.* Coach airfare and hotel stay paid by the conference.* Free admission to all of the co-located conferences* Speaker party* The adoration of attendees* etc. Your continued suport of Microsoft ASP.NET Connections and the other DevConnections conferences is appreciated. Good luck and thank you,Paul LitwinMicrosoft ASP.NET Conference Chair

    Read the article

  • Bitmask data insertions in SSDT Post-Deployment scripts

    - by jamiet
    On my current project we are using SQL Server Data Tools (SSDT) to manage our database schema and one of the tasks we need to do often is insert data into that schema once deployed; the typical method employed to do this is to leverage Post-Deployment scripts and that is exactly what we are doing. Our requirement is a little different though, our data is split up into various buckets that we need to selectively deploy on a case-by-case basis. I was going to use a SQLCMD variable for each bucket (defaulted to some value other than “Yes”) to define whether it should be deployed or not so we could use something like this in our Post-Deployment script: IF ($(DeployBucket1Flag) = 'Yes')BEGIN   :r .\Bucket1.data.sqlENDIF ($(DeployBucket2Flag) = 'Yes')BEGIN   :r .\Bucket2.data.sqlENDIF ($(DeployBucket3Flag) = 'Yes')BEGIN   :r .\Bucket3.data.sqlEND That works fine and is, I’m sure, a very common technique for doing this. It is however slightly ugly because we have to litter our deployment with various SQLCMD variables. My colleague James Rowland-Jones (whom I’m sure many of you know) suggested another technique – bitmasks. I won’t go into detail about how this works (James has already done that at Using a Bitmask - a practical example) but I’ll summarise by saying that you can deploy different combinations of the buckets simply by supplying a different numerical value for a single SQLCMD variable. Each bit of that value’s binary representation signifies whether a particular bucket should be deployed or not. This is better demonstrated using the following simple script (which can be easily leveraged inside your Post-Deployment scripts): /* $(DeployData) is a SQLCMD variable that would, if you were using this in SSDT, be declared in the SQLCMD variables section of your project file. It should contain a numerical value, defaulted to 0. In this example I have declared it using a :setvar statement. Test the affect of different values by changing the :setvar statement accordingly. Examples: :setvar DeployData 1 will deploy bucket 1 :setvar DeployData 2 will deploy bucket 2 :setvar DeployData 3   will deploy buckets 1 & 2 :setvar DeployData 6   will deploy buckets 2 & 3 :setvar DeployData 31  will deploy buckets 1, 2, 3, 4 & 5 */ :setvar DeployData 0 DECLARE  @bitmask VARBINARY(MAX) = CONVERT(VARBINARY,$(DeployData)); IF (@bitmask & 1 = 1) BEGIN     PRINT 'Bucket 1 insertions'; END IF (@bitmask & 2 = 2) BEGIN     PRINT 'Bucket 2 insertions'; END IF (@bitmask & 4 = 4) BEGIN     PRINT 'Bucket 3 insertions'; END IF (@bitmask & 8 = 8) BEGIN     PRINT 'Bucket 4 insertions'; END IF (@bitmask & 16 = 16) BEGIN     PRINT 'Bucket 5 insertions'; END An example of running this using DeployData=6 The binary representation of 6 is 110. The second and third significant bits of that binary number are set to 1 and hence buckets 2 and 3 are “activated”. Hope that makes sense and is useful to some of you! @Jamiet P.S. I used the awesome HTML Copy feature of Visual Studio’s Productivity Power Tools in order to format the T-SQL code above for this blog post.

    Read the article

  • Looking for Cutting-Edge Data Integration: 2014 Excellence Awards

    - by Sandrine Riley
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 It is nomination time!!! This year's Oracle Fusion Middleware Excellence Awards will honor customers and partners who are creatively using various products across Oracle Fusion Middleware. Think you have something unique and innovative with one or a few of our Oracle Data Integration products? We would love to hear from you! Please submit today. The deadline for the nomination is June 20, 2014. What you win: An Oracle Fusion Middleware Innovation trophy One free pass to Oracle OpenWorld 2014 Priority consideration for placement in Profit magazine, Oracle Magazine, or other Oracle publications & press release Oracle Fusion Middleware Innovation logo for inclusion on your own Website and/or press release Let us reminisce a little… For details on the 2013 Data Integration Winners: Royal Bank of Scotland’s Market and International Banking and The Yalumba Wine Company, check out this blog post: 2013 Oracle Excellence Awards for Fusion Middleware Innovation… and the Winners for Data Integration are… and for details on the 2012 Data Integration Winners: Raymond James and Morrisons, check out this blog post: And the Winners of Fusion Middleware Innovation Awards in Data Integration are…  Now to view the 2013 Winners (for all categories). We hope to honor you! Here's what you need to do:  Click here to submit your nomination today.  And just a reminder: the deadline to submit a nomination is 5pm Pacific Time on June 20, 2014. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

    Read the article

  • Help with Perl persistent data storage using Data::Dumper

    - by stephenmm
    I have been trying to figure this out for way to long tonight. I have googled it to death and none of the examples or my hacks of the examples are getting it done. It seems like this should be pretty easy but I just cannot get it. Here is the code: #!/usr/bin/perl -w use strict; use Data::Dumper; my $complex_variable = {}; my $MEMORY = "$ENV{HOME}/data/memory-file"; $complex_variable->{ 'key' } = 'value'; $complex_variable->{ 'key1' } = 'value1'; $complex_variable->{ 'key2' } = 'value2'; $complex_variable->{ 'key3' } = 'value3'; print Dumper($complex_variable)."TEST001\n"; open M, ">$MEMORY" or die; print M Data::Dumper->Dump([$complex_variable], ['$complex_variable']); close M; $complex_variable = {}; print Dumper($complex_variable)."TEST002\n"; # Then later to restore the value, it's simply: do $MEMORY; #eval $MEMORY; print Dumper($complex_variable)."TEST003\n"; And here is my output: $VAR1 = { 'key2' => 'value2', 'key1' => 'value1', 'key3' => 'value3', 'key' => 'value' }; TEST001 $VAR1 = {}; TEST002 $VAR1 = {}; TEST003 Everything that I read says that the TEST003 output should look identical to the TEST001 output which is exactly what I am trying to achieve. What am I missing here? Should I be "do"ing differently or should I be "eval"ing instead and if so how? Thanks for any help...

    Read the article

  • Relational database data explorer / visualization?

    - by Ian Boyd
    Is there a tool that can let one browse relational data as a graph of connected nodes? For example, i'm faced with trying to cleanse some anomolous data. i can start with two offending rows. In this particular example, the TransactionID should, by business rules, be unique to the table, but i find a transaction that violates that rule: SELECT * FROM LCTTrans WHERE TransactionID = 1075048 LCTID TransactionID ========= ============= 4358 1075048 4359 1075048 2 row(s) affected But really what i want to begin to hunt down all the related data, to try to see which is right. So this hypothetical software would start by showing me these two rows: Next, i want to see that transaction that is linked into this table: Now that transaction points to an MAL, so show me that: Now lets add those two LCTs, that the transaction is "on". A transaction can be on only one LCT, yet this one is pointing to two: Okay computer, both of those LCTs point to an MAL and the transaction that created them, show me those: Those last two transactions, they also point at an MAL, and they themselves point to an LCT, show me those: Okay, now are there any entries in LCTTrans that point to LCTs 4358 or 4359?... And so on, and so on. Now i did all this manually, running single selects, copying and pasting uniqueidentifier keys and converting them into friendly id numbers so i could easily see the relationships. Is there software that can do this?

    Read the article

  • How to properly set relationships in Core Data when using setValue and data already exists

    - by ern
    Let's say I have two objects: Articles and Categories. For the sake of this example all relevant categories have already been added to the data store. When looping through data that holds edits for articles, there is category relationship information that needs to be saved. I was planning on using the -setValue method in the Article class in order to set the relationships like so: - (void)setValue:(id)value forUndefinedKey:(NSString *)key { if([key isEqualToString:@"categories"]){ NSLog(@"trying to set categories..."); } } The problem is that value isn't a Category, it is just a string (or array of strings) holding the title of a category. I could certainly do a lookup within this method for each category and assign it, but that seems inefficient when processing a whole bunch of articles at once. Another option is to populate an array of all possible categories and just filter, but my question is where to store that array? Should it be a class method on Article? Is there a way to pass in additional data to the -setValue method? Is there another, better option for setting the relationship I'm not thinking of? Thanks for your help.

    Read the article

  • Does likewise-open > version 5.4 contain CIFS support?

    - by Ben Andken
    I'm trying to get the CIFS server working in likewise-open. I've found this set of instructions and everything seems to work until I try to connect ([url]http://www.likewise.com/resources/documentation_library/manuals/cifs/likewise-cifs-smb-file-server-guide.html#id2765992):[/url] 1.6. Build and Configure a Standalone Likewise-CIFS Server This section demonstrates how to build and configure a standalone instance of Likewise-CIFS from the command line. The following procedure assumes that you want to set up Likewise-CIFS on a Linux server to share files with Windows computers in a network without Active Directory. This procedure also assumes you know how to build Linux applications from their source code and then install them. Download Likewise-CIFS from its open source git location: $ git clone git://git.likewiseopen.org/ Download, build, and install the following tools. The tools listed are known to work, but earlier or later versions might work as well. Also, instead of downloading the tools, you might be able to install them on your platform with apt-get or some other means. http://ftp.gnu.org/gnu/autoconf/autoconf-2.65.tar.gz http://ftp.gnu.org/gnu/automake/automake-1.9.6.tar.gz http://ftp.gnu.org/gnu/libtool/libtool-2.2.6a.tar.gz http://pkgconfig.freedesktop.org/releases/pkg-config-0.23.tar.gz gcc --version 3.x or greater Build Likewise-CIFS: $ cd likewise-open $ build/mkcomp --debug all Install Likewise-CIFS: $ sudo su $ cd staging/install-root $ tar cf - . | (cd / && tar xvf -) Make sure Samba is not running: $ /etc/init.d/smb stop Make sure SELinux is either disabled or set to permissive. Make sure the ports required by Likewise are open. For a list of ports that Likewise uses, see the Likewise Open Installation and Administration Guide. Configure Likewise Open: $ /etc/init.d/lwsmd start $ for i in /etc/likewise/*.reg; do /opt/likewise/bin/lwregshell upgrade $i; done $ /etc/init.d/lwsmd stop $ /etc/init.d/lwsmd start $ /opt/likewise/bin/lwsm start srvsvc $ /opt/likewise/bin/domainjoin-cli configure --enable nsswitch Add a user account to the local Likewise provider database. In the following example, substitute the account name that you want for newuser. $ /opt/likewise/bin/lw-add-user --home /home/newuser --shell /bin/bash newuser Successfully added user newuser Enable the user and set the password: $ /opt/likewise/bin/lw-mod-user --enable-user --set-password newuser New Password: ********** Successfully modified user newuser Look up new user's identity as follows. Substitute the value from the command hostname -s for the hostname. Keep in mind that Likewise truncates a hostname longer than 15 characters to the first 15 characters of the string. % id hostname\\newuser uid=2000(HOSTNAME\newuser) gid=1800(HOSTNAME\Likewise Users) groups=1800(HOSTNAME\Likewise Users) context=system_u:system_r:unconfined_t:s0 Make a CIFS directory for the user: mkdir /lwcifs/newuser chown 2000:1800 /lwcifs/newuser From a Windows computer, map the Likewise-CIFS drive share: Computer->Map Network Drive... Folder: \\IP_hostname\c$ Click "Finish" Username: hostname\newuser Password: user_password The last step fails when I try to connect. I've tried with Windows XP Pro and Windows 7 Pro. The rest of the directions only appear to work for version 5.4 (the one that shipped with 10.04). For 12.04, version 6.1 is the only one available and it doesn't appear to have the srvsvc module mentioned in these instructions. Is CIFS support dropped in the 6.1 version of likewise-open?

    Read the article

  • Centos running Apache Tomcat keep getting "java.net.SocketException: Too many open files"

    - by Gerard Moroney
    We're running Apache Tomcat 7.0.41 on CentOS 6 with java version "1.7.0_21". We were getting a lot of too many open files errors so I did some research. The consensus was that it was to to with the number of open files. So I did the following: Increased max files in /etc/security/limits.conf soft nofile 100000 hard nofile 100000 Rebooted the server Checked the limits were valid for the user which was to run the process [app_admin@xxx ~]$ ulimit -Hn 100000 [app_admin@xxx ~]$ ulimit -Sn 100000 Monitored open files on the server using the lsof command What I observed was when the total open files reached circa 13000 and tomcat had around 4500 open files the error reappeared. I am confused. I thought it would have resolved the problem but clearly I don't fully understand the root cause and also how to set the parameter correctly. To (maybe) help I have not modified the server.xml file for Tomcat (although I'm tempted). I don't want to start fiddling with that and make things worse. I'm more than happy to share any more information if someone can give me some hints on where to start looking.

    Read the article

  • Clearos open vpn vs windows open vpn client where client connects with no default gateway

    - by Paul
    Am using clearos as open vpn server and configured my users on windows machine with open vpn client. My problem is that users connect to the server without a default gateway and also with ip conflicts, i can ping the server but i can not ping any user behind the server. please any one can help to find out what causes the clients to connect without a default gateway and also not to be able to ping any user behind the clearos open vpn server. Help with a step by step guide of installing open vpn on clearos and open vpn clients on windows. Thanks

    Read the article

  • Suggested Web Application Framework and Database for Enterprise, “Big-Data” App?

    - by willOEM
    I have a web application that I have been developing for a small group within my company over the past few years, using Pipeline Pilot (plus jQuery and Python scripting) for web development and back-end computation, and Oracle 10g for my RDBMS. Users upload experimental genomic data, which is parsed into a database, and made available for querying, transformation, and reporting. Experimental data sets are large and have many layers of metadata. A given experimental data record might have a foreign key relationship with a table that describes this data point's assay. Assays can cover multiple genes, which can have multiple transcript, which can have multiple mutations, which can affect multiple signaling pathways, etc. Users need to approach this data from any point in those layers in the metadata. Since all data sets for a given data type can run over a billion rows, this results in some large, dynamic queries that are hard to predict. New data sets are added on a weekly basis (~1GB per set). Experimental data is never updated, but the associated metadata can be updated weekly for a few records and yearly for most others. For every data set insert the system sees, there will be between 10 and 100 selects run against it and associated data. It is okay for updates and inserts to run slow, so long as queries run quick and are as up-to-date as possible. The application continues to grow in size and scope and is already starting to run slower than I like. I am worried that we have about outgrown Pipeline Pilot, and perhaps Oracle (as the sole database). Would a NoSQL database or an OLAP system be appropriate here? What web application frameworks work well with systems like this? I'd like the solution to be something scalable, portable and supportable X-years down the road. Here is the current state of the application: Web Server/Data Processing: Pipeline Pilot on Windows Server + IIS Database: Oracle 10g, ~1TB of data, ~180 tables with several billion-plus row tables Network Storage: Isilon, ~50TB of low-priority raw data

    Read the article

  • Simple ADF page using BAM Data Control

    - by [email protected]
    var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); try { var pageTracker = _gat._getTracker("UA-15829414-1"); pageTracker._trackPageview(); } catch(err) {} Purpose : In this blog I will walk you through very simple steps to create an ADF page using BAM data control connection.Details : Create the projectOpen JDeveloper (make sure you have installed the SOA extension for JDev)Create new Application using "Generic Application" template.Click on "Next"Shuttle  "ADF Faces" to right pane for the project technology.Click "Finish"Create a BAM connectionIn the resource palette click on "Folder->New Connection -> BAM"Enter the connection name and click "Next"Enter Connection details Click on "Test connection" and "Finish"Create the BAM Data Control Open the IDE connection created in above step.Drag and drop "Employees" to "Data controls" palette.Select "Flat Query" and Click "Finish".Create the View Create a new JSF page.From Data control Panel drag and drop "Employees->Query->ADF Read Only table"Right click and Run the page.

    Read the article

  • Data Aggregation of CSV files java

    - by royB
    I have k csv files (5 csv files for example), each file has m fields which produce a key and n values. I need to produce a single csv file with aggregated data. I'm looking for the most efficient solution for this problem, speed mainly. I don't think by the way that we will have memory issues. Also I would like to know if hashing is really a good solution because we will have to use 64 bit hashing solution to reduce the chance for a collision to less than 1% (we are having around 30000000 rows per aggregation). For example file 1: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,50,60,70,80 a3,b2,c4,60,60,80,90 file 2: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,30,50,90,40 a3,b2,c4,30,70,50,90 result: f1,f2,f3,v1,v2,v3,v4 a1,b1,c1,80,110,160,120 a3,b2,c4,90,130,130,180 algorithm that we thought until now: hashing (using concurentHashTable) merge sorting the files DB: using mysql or hadoop or redis. The solution needs to be able to handle Huge amount of data (each file more than two million rows) a better example: file 1 country,city,peopleNum england,london,1000000 england,coventry,500000 file 2: country,city,peopleNum england,london,500000 england,coventry,500000 england,manchester,500000 merged file: country,city,peopleNum england,london,1500000 england,coventry,1000000 england,manchester,500000 The key is: country,city. This is just an example, my real key is of size 6 and the data columns are of size 8 - total of 14 columns. We would like that the solution will be the fastest in regard of data processing.

    Read the article

  • SQL – Download FREE Book – Data Access for HighlyScalable Solutions: Using SQL, NoSQL, and Polyglot Persistence

    - by Pinal Dave
    Recently I was preparing for Big Data and I ended up on very interesting read for everybody. This is created by Microsoft and it is indeed a fantastic read as per my opinion. It took me some time to read this entire book but it was worth reading this as it tried to answer two of the very interesting questions related to muscle. Here is the abstract from the book: Organizations seeking to use a NoSQL database are therefore faced with a twofold challenge: • Which NoSQL database(s) best meet(s) the needs of the organization? • How does an organization integrate a NoSQL database into its solutions? As I keep on reading the book, I find it very interesting and informative. I suggest if you have time this weekend, download the book and read it. This guide focuses on the most common types of NoSQL database currently available, describes the situations for which they are most suited, and shows examples of how you might incorporate them into a business application. The guide summarizes the experiences of a fictitious organization named Adventure Works, who implemented a solution that comprised an assortment of different databases. Download Data Access for HighlyScalable Solutions:  Using SQL, NoSQL,  and Polyglot Persistence While we are talking about Big Data and NoSQL do not forget to check out my tomorrow’s blog as I am going to talk about the same subject and it will be very interesting. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, NoSQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

  • How to analyze data

    - by Subhash Dike
    We are working on an application that allows user to search/read some content in a particular domain. We wanted to add some capability in the app which can suggest user some content based on the usage pattern (analyze data based on frequency and relevance). Currently every time user search or read something we do store that information in backend database. We would like to use this data to present some additional content to user. Could someone explain what kind of tools will be required for such a job and any example? And what this concept is called, data analysis? data mining? business intelligence? or something else? Update: Sorry for being too broad, here is an example SQL Database (Just to give an idea, actual db is little different with normalization and stuff) Table: UserArticles Fields: UserName | ArticleId | ArticleTitle | DateVisited | ArticleCategory Table: CategoryArticles Fields: Category | Article Title | Author etc. One Category may have one more articles. One user may have read the same article multiple times (in this case we place additional entry in the user article table. Task: Use the information availabel in UserArticle table and rank categories in order which would be presented to user automatically in other part of application. Factors to be considered are frequency and recency. This might be possible through simple queries or may require specialized tools. Either way, the task is what mention above. I am not too sure which route to take, hence the question. Thoughts??

    Read the article

  • Data structure for bubble shooter game

    - by SundayMonday
    I'm starting to make a bubble shooter game for a mobile OS. Assume this is just the basic "three or more same-color bubbles that touch pop" and all bubbles that are separated from their group fall/pop. What data structures are common for storing the bubbles? I've considered using an undirected, connected graph where each node is a bubble. This seems like it could help answer the question "which bubbles (if any) should fall now?" after some arbitrary bubbles are popped and corresponding nodes are removed from the graph. I think the answer is all bubbles that were just disconnected from the graph should fall. However the graph approach might be overkill so I'm not sure. Another consideration for the data structure is collision detection. Perhaps being able to grab a list of neighboring bubbles in constant time for a particular "bubble slot" is useful. So the collision detection would be something like "moving bubble is closest to slot ij, neighbors of slot ij are bubbles a,b,c, moving bubble is sufficiently close to bubble b hence moving bubble should come to rest in slot ij". A game like this could be probably be made with a relatively crude grid structure as the primary data structure. However it seems like answering "which bubbles (if any) should fall now?" would be trickier with this data structure.

    Read the article

  • Can JSON be made easily and safely editable by the non-technical Excel crowd?

    - by glitch
    I'm looking for a data storage format that's very intuitive and easy to edit. It should be ideally targeted towards the same crowd as Excel. At the same time I would like the data structure to be a tree. Ideally this would be JSON, since it offers both the tree aspect and allows for more interesting constructs like arrays. That and parsing libraries for JSON are ubiquitous, so I don't have to reinvent the wheel. The problem is that, at least with a non-specialized text editor, JSON is a giant pain to edit for a non-technical user. I'm thinking along the lines of someone who might have used Excel in the past, but never a real text editor. Someone who might not be comfortable with the idea of preserving JSON syntax by hand. Are there data formats out there that would fit this profile? I'd very much prefer this to be a JSON actually, but then it would require a solid editing tool that would hide the underlying implementation from the user. Think Excel and how it abstracts CSV syntax from the user. The reason I'm looking for something like this is because the team has been working with pretty hierarchical data for a while now and we've hit the limits of how easy it is to represent in simple CSVs without having to create complex rules for how represent hierarchy semantics from each row. Any suggestions?

    Read the article

< Previous Page | 13 14 15 16 17 18 19 20 21 22 23 24  | Next Page >