transactional data - Page 11

Choices in Architecture, Design, Algorithms, Data Structures for effective RDF Reasoning and Querying in a Big Data Environment [on hold]

- by user2891213

As part of my academic project I would like to know what choices in Architecture, Design, Algorithms, Data Structures do we need in order to provide effective and efficient RDF Reasoning and Querying in a Big Data Environment. Basically I want to get info regarding below points: What are the Systems and Software to get appropriate Architecture? What kind of API layer(s) would we need on top of the Big Data stores, to make this possible? The Indexing structures we will need. The appropriate Algorithms, and appropriate Algorithms for Query Planning across Big Data stores. The Performance Analysis and Cost Models we will need to justify the design decisions we have made along the way. Can anyone please provide pointers.. Thanks, David

Read the article

Uploading your data using the Maps Data API just got easier

The Google Maps Data API is a great way to host your geographic data on Google’s scalable, high-performance servers, making your data accessible across platforms using HTTP or...

Read the article

Big Data – ClustrixDB – Extreme Scale SQL Database with Real-time Analytics, Releases Software Download – NewSQL

- by Pinal Dave

There are so many things to learn and there is so little time we all have. As we have little time we need to be selective to learn whatever we learn. I believe I know quite a lot of things in SQL but I still do not know what is around SQL. I have started to learn about NewSQL recently. If you wonder what is NewSQL I encourage all of you to read my blog post about NewSQL over here Big Data – Buzz Words: What is NewSQL – Day 10 of 21. NewSQL databases are quickly becoming popular – providing the scale of NoSQL with the SQL features and transactions. As a part of learning NewSQL database, I have recently started to learn about ClustrixDB. ClustrixDB has been the most mature NewSQL database used by some of the largest internet sites in the world for over 3 years, with extensive SQL support. In addition to scale, it provides fast real-time analytics by bringing massively parallel processing (MPP), available only in warehousing databases, to the transactional database. The reason I am more intrigued about learning ClustrixDB is their recent announcement on Oct 31. ClustrixDB was only available as an appliance, but now with their software release on Oct 31, everyone can use it. It is now available as forever free for up to 12 cores with community support, and there is a 45 day trial for unlimited cluster sizes. With the forever free world, I am indeed interested in ClustrixDB now. I know that few of the leading eCommerce sites in the world uses them for their transactional database. Here are few of the details I have quickly noted for ClustrixDB. ClustrixDB allows user to: Scale by simply adding nodes to the cluster with a single command Run billions of transactions a day Run fast real-time analytics Achieve high-availability with recovery from node failure Manages itself Easily migrate from MySQL as it is nearly plug-and-play compatible, use MySQL drivers, tools and replication. While I was going through the documentation I realized that ClustrixDB also has extensive support for SQL features including complex queries involving joins on a dozen or more tables, aggregates, sorts, sub-queries. It also supports stored procedures, triggers, foreign keys, partitioned and temporary tables, and fully online schema changes. It is indeed a very matured product and SQL solution. Indeed Clusterix sound very promising solution, I decided to dig a bit deeper to understand who are current customers of the Clustrix as they exist in the industry for quite a few years. Their client list is indeed very interesting and here is my quick research about them. Twoo.com – Europe’s largest social discovery (dating) site runs 4.4 Billion Transactions a day with table sizes over a Terabyte, on a 168 core cluster. EngageBDR – Top 3 in the online advertising category uses ClustrixDB to serve 6.9 billion ads a day through real-time bidding platform. Their reports went from 4 hours to 15 seconds. NoMoreRack – Top 2 fastest growing e-commerce company in US used ClustrixDB for high availability and fast growth through Amazon cloud. MakeMyTrip – India’s leading travel site runs on ClustrixDB with two clusters running as multi-master in Chennai and Bangalore. Many enterprises such as AOL, CSC, Rakuten, Symantec use ClustrixDB when their applications need scale. I must accept that I am impressed with the information I have learned so far and now is the time to do some hand’s on experience with their product. I want to learn this technology so in future when it is about NewSQL, I know what I am talking about. Read more why Clustrix explains why you ClustrixDB might be the right database for you. Download ClustrixDB with me today and install it on your machine so in future when we discuss the technical aspects of it, we all are on the same page. The software can be downloaded here. Reference : Pinal Dave (http://blog.SQLAuthority.com)Filed under: Big Data, MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: Clustrix

Read the article

Data Structure Behind Amazon S3s Keys (Filtering Data Structure)

- by dimo414

I'd like to implement a data structure similar to the lookup functionality of Amazon S3. For those of you who don't know what I'm taking about, Amazon S3 stores all files at the root, but allows you to look up groups of files by common prefixes in their names, therefore replicating the power of a directory tree without the complexity of it. The catch is, both lookup and filter operations are O(1) (or close enough that even on very large buckets - S3's disk equivalents - both operations might as well be O(1))). So in short, I'm looking for a data structure that functions like a hash map, with the added benefit of efficient (at the very least not O(n)) filtering. The best I can come up with is extending HashMap so that it also contains a (sorted) list of contents, and doing a binary search for the range that matches the prefix, and returning that set. This seems slow to me, but I can't think of any other way to do it. Does anyone know either how Amazon does it, or a better way to implement this data structure?

Read the article

Best Data Structure For Time Series Data

- by TriParkinson

Hi all, I wonder if someone could take a minute out of their day to give their two cents on my problem. I would like some suggestions on what would be the best data structure for representing, on disk, a large data set of time series data. The main priority is speed of insertion, with other priorities in decreasing order; speed of retrieval, size on disk, size in memory, speed of removal. I have seen that B+ trees are often used in database because of their fast search times, but how about for fast insertion times? Is a linked list really the way to go? Thanks in advance for your time, Tri

Read the article

MySQL: Blank row in table after LOAD DATA INFILE

- by Tom

Hi, I'm uploading a large amount of data from a CSV (I'm doing it via MySQL Workbench): LOAD DATA INFILE 'C:/development/mydoc.csv' INTO TABLE mydatabase.mytable CHARACTER SET utf8 FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r'; However, I'm noticing that it keeps adding an empty line full of nulls/zeros after the last record. I'm guessing it's because of the "LINES TERMINATED" command. However, I need that to load the data in correctly. Is there some way around this / some better SQL to avoid the blank row in the table? Thanks

Read the article

Update tableview instantly as data pushed in core data iphone

- by user336685

I need to update the tableview as soon as the content is pushed in core data database. for this AppDelegate.m contains following code NSManagedObjectContext *moc = [self managedObjectContext]; NSFetchRequest *request = [[NSFetchRequest alloc] init]; [request setEntity:[NSEntityDescription entityForName:@"FeedItem" inManagedObjectContext:moc]]; //for loop // push data in code data & then save context [moc save:&error]; ZAssert(error == nil, @"Error saving context: %@", [error localizedDescription]); //for loop ends This code triggers following code from RootviewController.m - (void)controllerWillChangeContent:(NSFetchedResultsController*)controller { [[self tableView] beginUpdates]; } But this updates the tableview only at the end of the for loop ,the table does not get updated after immediate push in db. I tried following code but that didn't work - (void)controllerDidChangeContent:(NSFetchedResultsController *)controller { // In the simplest, most efficient, case, reload the table view. [self.tableView reloadData]; } I have been stuck with this problem for several days.Please help.Thanks in advance for solution.

Read the article

Core Data data type for just the date - not including time

- by Jason

I am new at Core Data, and it seems like it is a great way to manage the data store. However I am also very memory-conscious due to the fact that the iPhone doesn't have that much of it. I was a little surprised to see that the data types are so limited - eg. there is a Date type which includes also the time, but no Date type for just the date! All the time information takes up precious bytes of memory, if I just wanted an attribute with the date (e.g. 2/15/2010 rather than 2/15/2010 02:34:48), how could I do this? Is it possible?

Read the article

Clever ways of implementing different data structures in C & data structures that should be used mor

- by Yktula

What are some clever (not ordinary) ways of implementing data structures in C, and what are some data structures that should be used more often? For example, what is the most effective way (generating minimal overhead) to implement a directed and cyclic graph with weighted edges in C? I know that we can store the distances in an array as is done here, but what other ways are there to implement this kind of a graph?

Read the article

Using GameKit to transfer CoreData data between iPhones, via NSDictionary

- by OscarTheGrouch

I have an application where I would like to exchange information, managed via Core Data, between two iPhones. First turning the Core Data object to an NSDictionary (something very simple that gets turned into NSData to be transferred). My CoreData has 3 string attributes, 2 image attributes that are transformables. I have looked through the NSDictionary API but have not had any luck with it, creating or adding the CoreData information to it. Any help or sample code regarding this would be greatly appreciated.

Read the article

Good practices for multiple language data in Core Data

- by Luca Bartoletti

Hi, i need a multilingual coredata db in my iphone app. I could create different database for each language but i hope that in iphone sdk exist an automatically way to manage data in different language core data like for resources and string. Someone have some hints?

Read the article

how to find maximum frequent item sets from large transactional data file

- by ANIL MANE

Hi, I have the input file contains large amount of transactions like Transaction ID Items T1 Bread, milk, coffee, juice T2 Juice, milk, coffee T3 Bread, juice T4 Coffee, milk T5 Bread, Milk T6 Coffee, Bread T7 Coffee, Bread, Juice T8 Bread, Milk, Juice T9 Milk, Bread, Coffee, T10 Bread T11 Milk T12 Milk, Coffee, Bread, Juice i want the occurrence of every unique item like Item Name Count Bread 9 Milk 8 Coffee 7 Juice 6 and from that i want an a fp-tree now by traversing this tree i want the maximal frequent itemsets as follows The basic idea of method is to dispose nodes in each “layer” from bottom to up. The concept of “layer” is different to the common concept of layer in a tree. Nodes in a “layer” mean the nodes correspond to the same item and be in a linked list from the “Head Table”. For nodes in a “layer” NBN method will be used to dispose the nodes from left to right along the linked list. To use NBN method, two extra fields will be added to each node in the ordered FP-Tree. The field tag of node N stores the information of whether N is maximal frequent itemset, and the field count’ stores the support count information in the nodes at left. In Figure, the first node to be disposed is “juice: 2”. If the min_sup is equal to or less than 2 then “bread, milk, coffee, juice” is a maximal frequent itemset. Firstly output juice:2 and set the field tag of “coffee:3” as “false” (the field tag of each node is “true” initially ). Next check whether the right four itemsets juice:1 be the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 set the field tag of the node “false”. In the following process when the field tag of the disposed node is FALSE we can omit the node after the same tagging. If the min_sup is more than 2 then check whether the right four juice:1 is the subset of juice:2. If the itemset one node “juice:1” corresponding to is the subset of juice:2 then set the field count’ of the node with the sum of the former count’ and 2 After all the nodes “juice” disposed ,begin to dispose the node “coffee:3”. Any suggestions or available source code, welcome. thanks in advance

Read the article

core-data relationships and data structure.

- by Boaz

What is the right way to build iPhone core data for this SMS like app (with location)? - I want to represent an entity of conversation with "profile1" "profile2" that heritage from a profile entity, and a message entity with: "to" "from" "body" where the "to" and "from" are equal to "profile1" and/or "profile2" in the conversation entity. How can I make such a relationships? is there a better way to represent the data (other structure)? Thanks

Read the article

convert unstructured data to structured data?

- by Codeguru

How to convert unstructured data into structured data?For example email,contacts to a structured format. Are there any algorithms to do this??

Read the article

When is the Data Vault model the right model for a data-warehouse?

- by Stephan Eggermont

I recently found a reference to 'Data Vault modeling' as a model for data-warehouses. The models I've seen before are Inmon and Kimball. The author refers to possible performance problems due to the joins needed. It looks like a nice model, but I wonder about the gotcha's. Are there any experience reports on-line?

Read the article

Spring Data Neo4J @Indexed(unique = true) not working

- by Markus Lamm

I'm new to Neo4J and I have, probably an easy question. There're NodeEntitys in my application, a property (name) is annotated with @Indexed(unique = true) to achieve the uniqueness like I do in JPA with @Column(unique = true). My problem is, that when I persist an entity with a name that already exists in my graph, it works fine anyway. But I expected some kind of exception here...?! Here' s an overview over basic my code: @NodeEntity public abstract class BaseEntity implements Identifiable { @GraphId private Long entityId; ... } public class Role extends BaseEntity { @Indexed(unique = true) private String name; ... } public interface RoleRepository extends GraphRepository<Role> { Role findByName(String name); } @Service public class RoleServiceImpl extends BaseEntityServiceImpl<Role> implements { private RoleRepository repository; @Override @Transactional public T save(final T entity) { return getRepository().save(entity); } } And this is my test: @Test public void testNameUniqueIndex() { final List<Role> roles = Lists.newLinkedList(service.findAll()); final String existingName = roles.get(0).getName(); Role newRole = new Role.Builder(existingName).build(); newRole = service.save(newRole); } That's the point where I expect something to go wrong! How can I ensure the uniqueness of a property, without checking it for myself?? THANKS IN ADVANCE FOR ANY IDEAS!! P.S.: I'm using neo4j 1.8.M07, spring-data-neo4j 2.1.0.BUILD-SNAPSHOT and Spring 3.1.2.RELEASE.

Read the article

[Visual C++]Forcing memory alignment of variables/data-structures

- by John

I'm looking at using SSE and I gather aligning data on 16byte boundaries is recommended. There are two cases to consider: float data[4]; struct myystruct { float x,y,z,w; }; I'm not sure the first case can be done explicitly, though there's perhaps a compiler option I could use? In the second case I remember being able to control packing in old versions of GCC several years back, is this still possible?

Read the article

SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

- by pinaldave

This will drop all the clean buffers so we will be able to start again from there. Now, run the following script and check the execution plan of the query. Have you ever wondered what types of data are there in your cache? During SQL Server Trainings, I am usually asked if there is any way one can know how much data in a table is stored in the memory cache? The more detailed question I usually get is if there are multiple indexes on table (and used in a query), were the data of the single table stored multiple times in the memory cache or only for a single time? Here is a query you can run to figure out what kind of data is stored in the cache. USE AdventureWorks GO SELECT COUNT(*) AS cached_pages_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id WHERE database_id = DB_ID() GROUP BY name, index_id, IndexName, IndexTypeDesc ORDER BY cached_pages_count DESC; GO Now let us run the query above and observe the output of the same. We can see in the above query that there are four columns. Cached_Pages_Count lists the pages cached in the memory. BaseTableName lists the original base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index. Now, let us do one more experience here. Please note that you should not run this test on a production server as it can extremely reduce the performance of the database. DBCC DROPCLEANBUFFERS This will drop all the clean buffers and we will be able to start again from there. Now run following script and check the execution plan for the same. USE AdventureWorks GO SELECT UnitPrice, ModifiedDate FROM Sales.SalesOrderDetail WHERE SalesOrderDetailID BETWEEN 1 AND 100 GO The execution plans contain the usage of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the Resultset that when more than one index is used, datapages related to both or all of the indexes are stored in Memory Cache separately. Let me know what you think of this article. I had a great pleasure while writing this article because I was able to write on this subject, which I like the most. In the next article, we will exactly see what data are cached and those that are not cached, using a few undocumented commands. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, SQL, SQL Authority, SQL Optimization, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology Tagged: SQL DMV

Read the article

Bitmask data insertions in SSDT Post-Deployment scripts

- by jamiet

On my current project we are using SQL Server Data Tools (SSDT) to manage our database schema and one of the tasks we need to do often is insert data into that schema once deployed; the typical method employed to do this is to leverage Post-Deployment scripts and that is exactly what we are doing. Our requirement is a little different though, our data is split up into various buckets that we need to selectively deploy on a case-by-case basis. I was going to use a SQLCMD variable for each bucket (defaulted to some value other than “Yes”) to define whether it should be deployed or not so we could use something like this in our Post-Deployment script: IF ($(DeployBucket1Flag) = 'Yes')BEGIN :r .\Bucket1.data.sqlENDIF ($(DeployBucket2Flag) = 'Yes')BEGIN :r .\Bucket2.data.sqlENDIF ($(DeployBucket3Flag) = 'Yes')BEGIN :r .\Bucket3.data.sqlEND That works fine and is, I’m sure, a very common technique for doing this. It is however slightly ugly because we have to litter our deployment with various SQLCMD variables. My colleague James Rowland-Jones (whom I’m sure many of you know) suggested another technique – bitmasks. I won’t go into detail about how this works (James has already done that at Using a Bitmask - a practical example) but I’ll summarise by saying that you can deploy different combinations of the buckets simply by supplying a different numerical value for a single SQLCMD variable. Each bit of that value’s binary representation signifies whether a particular bucket should be deployed or not. This is better demonstrated using the following simple script (which can be easily leveraged inside your Post-Deployment scripts): /* $(DeployData) is a SQLCMD variable that would, if you were using this in SSDT, be declared in the SQLCMD variables section of your project file. It should contain a numerical value, defaulted to 0. In this example I have declared it using a :setvar statement. Test the affect of different values by changing the :setvar statement accordingly. Examples: :setvar DeployData 1 will deploy bucket 1 :setvar DeployData 2 will deploy bucket 2 :setvar DeployData 3 will deploy buckets 1 & 2 :setvar DeployData 6 will deploy buckets 2 & 3 :setvar DeployData 31 will deploy buckets 1, 2, 3, 4 & 5 */ :setvar DeployData 0 DECLARE @bitmask VARBINARY(MAX) = CONVERT(VARBINARY,$(DeployData)); IF (@bitmask & 1 = 1) BEGIN PRINT 'Bucket 1 insertions'; END IF (@bitmask & 2 = 2) BEGIN PRINT 'Bucket 2 insertions'; END IF (@bitmask & 4 = 4) BEGIN PRINT 'Bucket 3 insertions'; END IF (@bitmask & 8 = 8) BEGIN PRINT 'Bucket 4 insertions'; END IF (@bitmask & 16 = 16) BEGIN PRINT 'Bucket 5 insertions'; END An example of running this using DeployData=6 The binary representation of 6 is 110. The second and third significant bits of that binary number are set to 1 and hence buckets 2 and 3 are “activated”. Hope that makes sense and is useful to some of you! @Jamiet P.S. I used the awesome HTML Copy feature of Visual Studio’s Productivity Power Tools in order to format the T-SQL code above for this blog post.

Read the article

Looking for Cutting-Edge Data Integration: 2014 Excellence Awards

- by Sandrine Riley

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 It is nomination time!!! This year's Oracle Fusion Middleware Excellence Awards will honor customers and partners who are creatively using various products across Oracle Fusion Middleware. Think you have something unique and innovative with one or a few of our Oracle Data Integration products? We would love to hear from you! Please submit today. The deadline for the nomination is June 20, 2014. What you win: An Oracle Fusion Middleware Innovation trophy One free pass to Oracle OpenWorld 2014 Priority consideration for placement in Profit magazine, Oracle Magazine, or other Oracle publications & press release Oracle Fusion Middleware Innovation logo for inclusion on your own Website and/or press release Let us reminisce a little… For details on the 2013 Data Integration Winners: Royal Bank of Scotland’s Market and International Banking and The Yalumba Wine Company, check out this blog post: 2013 Oracle Excellence Awards for Fusion Middleware Innovation… and the Winners for Data Integration are… and for details on the 2012 Data Integration Winners: Raymond James and Morrisons, check out this blog post: And the Winners of Fusion Middleware Innovation Awards in Data Integration are… Now to view the 2013 Winners (for all categories). We hope to honor you! Here's what you need to do: Click here to submit your nomination today. And just a reminder: the deadline to submit a nomination is 5pm Pacific Time on June 20, 2014. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article

Help with Perl persistent data storage using Data::Dumper

- by stephenmm

I have been trying to figure this out for way to long tonight. I have googled it to death and none of the examples or my hacks of the examples are getting it done. It seems like this should be pretty easy but I just cannot get it. Here is the code: #!/usr/bin/perl -w use strict; use Data::Dumper; my $complex_variable = {}; my $MEMORY = "$ENV{HOME}/data/memory-file"; $complex_variable->{ 'key' } = 'value'; $complex_variable->{ 'key1' } = 'value1'; $complex_variable->{ 'key2' } = 'value2'; $complex_variable->{ 'key3' } = 'value3'; print Dumper($complex_variable)."TEST001\n"; open M, ">$MEMORY" or die; print M Data::Dumper->Dump([$complex_variable], ['$complex_variable']); close M; $complex_variable = {}; print Dumper($complex_variable)."TEST002\n"; # Then later to restore the value, it's simply: do $MEMORY; #eval $MEMORY; print Dumper($complex_variable)."TEST003\n"; And here is my output: $VAR1 = { 'key2' => 'value2', 'key1' => 'value1', 'key3' => 'value3', 'key' => 'value' }; TEST001 $VAR1 = {}; TEST002 $VAR1 = {}; TEST003 Everything that I read says that the TEST003 output should look identical to the TEST001 output which is exactly what I am trying to achieve. What am I missing here? Should I be "do"ing differently or should I be "eval"ing instead and if so how? Thanks for any help...

Read the article

Relational database data explorer / visualization?

- by Ian Boyd

Is there a tool that can let one browse relational data as a graph of connected nodes? For example, i'm faced with trying to cleanse some anomolous data. i can start with two offending rows. In this particular example, the TransactionID should, by business rules, be unique to the table, but i find a transaction that violates that rule: SELECT * FROM LCTTrans WHERE TransactionID = 1075048 LCTID TransactionID ========= ============= 4358 1075048 4359 1075048 2 row(s) affected But really what i want to begin to hunt down all the related data, to try to see which is right. So this hypothetical software would start by showing me these two rows: Next, i want to see that transaction that is linked into this table: Now that transaction points to an MAL, so show me that: Now lets add those two LCTs, that the transaction is "on". A transaction can be on only one LCT, yet this one is pointing to two: Okay computer, both of those LCTs point to an MAL and the transaction that created them, show me those: Those last two transactions, they also point at an MAL, and they themselves point to an LCT, show me those: Okay, now are there any entries in LCTTrans that point to LCTs 4358 or 4359?... And so on, and so on. Now i did all this manually, running single selects, copying and pasting uniqueidentifier keys and converting them into friendly id numbers so i could easily see the relationships. Is there software that can do this?

Read the article

How to properly set relationships in Core Data when using setValue and data already exists

- by ern

Let's say I have two objects: Articles and Categories. For the sake of this example all relevant categories have already been added to the data store. When looping through data that holds edits for articles, there is category relationship information that needs to be saved. I was planning on using the -setValue method in the Article class in order to set the relationships like so: - (void)setValue:(id)value forUndefinedKey:(NSString *)key { if([key isEqualToString:@"categories"]){ NSLog(@"trying to set categories..."); } } The problem is that value isn't a Category, it is just a string (or array of strings) holding the title of a category. I could certainly do a lookup within this method for each category and assign it, but that seems inefficient when processing a whole bunch of articles at once. Another option is to populate an array of all possible categories and just filter, but my question is where to store that array? Should it be a class method on Article? Is there a way to pass in additional data to the -setValue method? Is there another, better option for setting the relationship I'm not thinking of? Thanks for your help.

Read the article

SQL SERVER – Reducing CXPACKET Wait Stats for High Transactional Database

- by pinaldave

While engaging in a performance tuning consultation for a client, a situation occurred where they were facing a lot of CXPACKET Waits Stats. The client asked me if I could help them reduce this huge number of wait stats. I usually receive this kind of request from other client as well, but the important thing to understand is whether this question has any merits or benefits, or not. Before we continue the resolution, let us understand what CXPACKET Wait Stats are. The official definition suggests that CXPACKET Wait Stats occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if a conflict concerning this wait type develops into a problem. (from BOL) In simpler words, when a parallel operation is created for SQL Query, there are multiple threads for a single query. Each query deals with a different set of the data (or rows). Due to some reasons, one or more of the threads lag behind, creating the CXPACKET Wait Stat. Threads which came first have to wait for the slower thread to finish. The Wait by a specific completed thread is called CXPACKET Wait Stat. Note that CXPACKET Wait is done by completed thread and not the one which are unfinished. “Note that not all the CXPACKET wait types are bad. You might experience a case when it totally makes sense. There might also be cases when this is also unavoidable. If you remove this particular wait type for any query, then that query may run slower because the parallel operations are disabled for the query.” Now let us see what the best practices to reduce the CXPACKET Wait Stats are. The suggestions, with which you will find that if you search online through the browser, would play a major role as and might be asked about their jobs In addition, might tell you that you should set ‘maximum degree of parallelism’ to 1. I do agree with these suggestions, too; however, I think this is not the final resolutions. As soon as you set your entire query to run on single CPU, you will get a very bad performance from the queries which are actually performing okay when using parallelism. The best suggestion to this is that you set ‘the maximum degree of parallelism’ to a lower number or 1 (be very careful with this – it can create more problems) but tune the queries which can be benefited from multiple CPU’s. You can use query hint OPTION (MAXDOP 0) to run the server to use parallelism. Here is the two-quick script which helps to resolve these issues: Change MAXDOP at Server Level EXEC sys.sp_configure N'max degree of parallelism', N'1' GO RECONFIGURE WITH OVERRIDE GO Run Query with all the CPU (using parallelism) USE AdventureWorks GO SELECT * FROM Sales.SalesOrderDetail ORDER BY ProductID OPTION (MAXDOP 0) GO Below is the blog post which will help you to find all the parallel query in your server. SQL SERVER – Find Queries using Parallelism from Cached Plan Please note running Queries in single CPU may worsen your performance and it is not recommended at all. Infect this can be very bad advise. I strongly suggest that you identify the queries which are offending and tune them instead of following any other suggestions. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQL White Papers, SQLAuthority News, T SQL, Technology

Read the article

Suggested Web Application Framework and Database for Enterprise, “Big-Data” App?

- by willOEM

I have a web application that I have been developing for a small group within my company over the past few years, using Pipeline Pilot (plus jQuery and Python scripting) for web development and back-end computation, and Oracle 10g for my RDBMS. Users upload experimental genomic data, which is parsed into a database, and made available for querying, transformation, and reporting. Experimental data sets are large and have many layers of metadata. A given experimental data record might have a foreign key relationship with a table that describes this data point's assay. Assays can cover multiple genes, which can have multiple transcript, which can have multiple mutations, which can affect multiple signaling pathways, etc. Users need to approach this data from any point in those layers in the metadata. Since all data sets for a given data type can run over a billion rows, this results in some large, dynamic queries that are hard to predict. New data sets are added on a weekly basis (~1GB per set). Experimental data is never updated, but the associated metadata can be updated weekly for a few records and yearly for most others. For every data set insert the system sees, there will be between 10 and 100 selects run against it and associated data. It is okay for updates and inserts to run slow, so long as queries run quick and are as up-to-date as possible. The application continues to grow in size and scope and is already starting to run slower than I like. I am worried that we have about outgrown Pipeline Pilot, and perhaps Oracle (as the sole database). Would a NoSQL database or an OLAP system be appropriate here? What web application frameworks work well with systems like this? I'd like the solution to be something scalable, portable and supportable X-years down the road. Here is the current state of the application: Web Server/Data Processing: Pipeline Pilot on Windows Server + IIS Database: Oracle 10g, ~1TB of data, ~180 tables with several billion-plus row tables Network Storage: Isilon, ~50TB of low-priority raw data

Search Results

Search found 58653 results on 2347 pages for 'transactional data'.

Page 11/2347 | < Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18 | Next Page >

- by user2891213

- by Pinal Dave

- by dimo414

- by TriParkinson

- by Tom

- by user336685

- by Jason

- by Yktula

- by OscarTheGrouch

- by Luca Bartoletti

- by ANIL MANE

- by Boaz

- by Codeguru

- by Stephan Eggermont

- by Markus Lamm

- by John

- by pinaldave

- by jamiet

- by Sandrine Riley

- by stephenmm

- by Ian Boyd

- by ern

- by pinaldave

- by willOEM

< Previous Page | 7 8 9 10 11 12 13 14 15 16 17 18 | Next Page >