data compression - Page 15

Choices in Architecture, Design, Algorithms, Data Structures for effective RDF Reasoning and Querying in a Big Data Environment [on hold]

- by user2891213

As part of my academic project, I would like to know which choices in architecture, design, algorithms, and data structures are needed to provide effective and efficient RDF reasoning and querying in a Big Data environment. Specifically, I would like information on the following points: Which systems and software give an appropriate architecture? What kind of API layer(s) would we need on top of the Big Data stores to make this possible? Which indexing structures will we need? Which algorithms are appropriate, in particular for query planning across Big Data stores? Which performance analyses and cost models will we need to justify the design decisions made along the way? Can anyone please provide pointers? Thanks, David


Uploading your data using the Maps Data API just got easier

The Google Maps Data API is a great way to host your geographic data on Google’s scalable, high-performance servers, making your data accessible across platforms using HTTP or...


Does ZFS cache Compressed or Uncompressed data in a ZFS file-system with compression turned on?

- by George Bailey

ZFS supports file-system compression, and it also caches frequently or recently accessed data. If a system has plenty of CPU but the underlying storage is slow, ZFS may perform better with compression turned on. For writes this is easy to test by measuring CPU usage, disk usage, and throughput (latency may increase, but that should not matter for large files). But what about the cache? If data has to be decompressed every time it is read from the cache, compression is probably less of a good idea. Is the cached data stored compressed or uncompressed? Does anybody have information on this?


Data Structure Behind Amazon S3s Keys (Filtering Data Structure)

- by dimo414

I'd like to implement a data structure similar to the lookup functionality of Amazon S3. For those of you who don't know what I'm talking about, Amazon S3 stores all files at the root but allows you to look up groups of files by common prefixes in their names, thereby replicating the power of a directory tree without its complexity. The catch is that both lookup and filter operations are O(1), or close enough that even on very large buckets (S3's disk equivalents) both operations might as well be O(1). In short, I'm looking for a data structure that functions like a hash map, with the added benefit of efficient (at the very least not O(n)) filtering. The best I can come up with is extending HashMap so that it also contains a sorted list of its keys, doing a binary search for the range that matches the prefix, and returning that set. This seems slow to me, but I can't think of any other way to do it. Does anyone know either how Amazon does it or a better way to implement this data structure?
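
The approach described in the question (a hash map plus a sorted key list searched by bisection) can be sketched as follows. This is a minimal illustration, not how Amazon S3 works internally, which the question leaves open; the class name `PrefixMap` and its methods are hypothetical names chosen for this sketch.

```python
import bisect


class PrefixMap:
    """Hash map for exact lookups plus a sorted key list for prefix scans."""

    def __init__(self):
        self._values = {}  # key -> value, O(1) average exact lookup
        self._keys = []    # all keys, kept sorted for range queries

    def put(self, key, value):
        if key not in self._values:
            # insort finds the slot in O(log n) but shifts list items in O(n)
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key):
        return self._values[key]

    def keys_with_prefix(self, prefix):
        # Keys sharing a prefix are contiguous in sorted order, so one
        # binary search finds the start and a linear scan collects matches.
        start = bisect.bisect_left(self._keys, prefix)
        end = start
        while end < len(self._keys) and self._keys[end].startswith(prefix):
            end += 1
        return self._keys[start:end]


store = PrefixMap()
for name in ["photos/2010/a.jpg", "photos/2011/b.jpg",
             "docs/readme.txt", "photos/2010/c.jpg"]:
    store.put(name, len(name))

print(store.keys_with_prefix("photos/2010/"))
```

A prefix query costs O(log n + k) for k matching keys, so it is not O(n) unless almost everything matches. The weak spot of this sketch is insertion into the sorted list; a balanced tree or a trie would avoid the O(n) shifting.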


Best Data Structure For Time Series Data

- by TriParkinson

Hi all, I wonder if someone could take a minute out of their day to give their two cents on my problem. I would like some suggestions on the best data structure for representing, on disk, a large set of time series data. The main priority is speed of insertion, followed in decreasing order of priority by speed of retrieval, size on disk, size in memory, and speed of removal. I have seen that B+ trees are often used in databases because of their fast search times, but how do they fare for fast insertion? Is a linked list really the way to go? Thanks in advance for your time, Tri


MySQL: Blank row in table after LOAD DATA INFILE

- by Tom

Hi, I'm uploading a large amount of data from a CSV (I'm doing it via MySQL Workbench):

    LOAD DATA INFILE 'C:/development/mydoc.csv'
    INTO TABLE mydatabase.mytable
    CHARACTER SET utf8
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\r';

However, I'm noticing that it keeps adding an empty row full of nulls/zeros after the last record. I'm guessing it's because of the "LINES TERMINATED" clause. However, I need that clause to load the data correctly. Is there some way around this, or some better SQL that avoids the blank row in the table? Thanks
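
One possible cause can be illustrated without a database. This is a sketch under a loud assumption: that the CSV was saved with Windows line endings (`\r\n`), which the question does not confirm. The helper `split_records` is hypothetical and only mimics splitting on a line terminator; it is not MySQL's actual parser.

```python
def split_records(data, terminator):
    """Split raw file text into records on a line terminator."""
    records = data.split(terminator)
    # A terminator at the very end of the file leaves an empty final piece.
    if records and records[-1] == "":
        records.pop()
    return records


# Assumption: the file uses \r\n line endings.
csv_text = "1,alpha\r\n2,beta\r\n"

with_cr = split_records(csv_text, "\r")
with_crlf = split_records(csv_text, "\r\n")

print(with_cr)    # the trailing "\n" survives as an extra record
print(with_crlf)  # exactly two records
```

If the file really does use `\r\n`, splitting on `\r` alone leaves a stray `\n` after the last record, which would be loaded as one more (empty) row. In that case `LINES TERMINATED BY '\r\n'` would be the clause to try.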


Update tableview instantly as data pushed in core data iphone

- by user336685

I need to update the table view as soon as content is pushed into the Core Data database. For this, AppDelegate.m contains the following code:

    NSManagedObjectContext *moc = [self managedObjectContext];
    NSFetchRequest *request = [[NSFetchRequest alloc] init];
    [request setEntity:[NSEntityDescription entityForName:@"FeedItem" inManagedObjectContext:moc]];
    // for loop
    // push data into Core Data and then save the context
    [moc save:&error];
    ZAssert(error == nil, @"Error saving context: %@", [error localizedDescription]);
    // for loop ends

This code triggers the following code in RootviewController.m:

    - (void)controllerWillChangeContent:(NSFetchedResultsController *)controller {
        [[self tableView] beginUpdates];
    }

But the table view is updated only at the end of the for loop, not immediately after each push into the database. I tried the following code, but it didn't work:

    - (void)controllerDidChangeContent:(NSFetchedResultsController *)controller {
        // In the simplest, most efficient case, reload the table view.
        [self.tableView reloadData];
    }

I have been stuck on this problem for several days. Please help. Thanks in advance for a solution.


Core Data data type for just the date - not including time

- by Jason

I am new to Core Data, and it seems like a great way to manage the data store. However, I am also very memory-conscious because the iPhone doesn't have much memory. I was a little surprised to see that the data types are so limited: for example, there is a Date type that also includes the time, but no type for just the date. All the time information takes up precious bytes of memory. If I just wanted an attribute with the date (e.g. 2/15/2010 rather than 2/15/2010 02:34:48), how could I do this? Is it possible?


Clever ways of implementing different data structures in C & data structures that should be used mor

- by Yktula

What are some clever (not ordinary) ways of implementing data structures in C, and what are some data structures that should be used more often? For example, what is the most effective way (generating minimal overhead) to implement a directed and cyclic graph with weighted edges in C? I know that we can store the distances in an array as is done here, but what other ways are there to implement this kind of a graph?


Using GameKit to transfer CoreData data between iPhones, via NSDictionary

- by OscarTheGrouch

I have an application in which I would like to exchange information, managed via Core Data, between two iPhones. The first step is turning the Core Data object into an NSDictionary (something very simple that then gets turned into NSData to be transferred). My Core Data entity has 3 string attributes and 2 image attributes that are transformables. I have looked through the NSDictionary API but have not had any luck creating a dictionary from the Core Data information or adding that information to one. Any help or sample code would be greatly appreciated.


MPI: are there MPI libraries capable of message compression?

- by osgx

Sometimes MPI is used to send low-entropy data in messages, so it can be useful to compress messages before sending them. I know that MPI can work on very fast networks (10 Gbit/s and more), but many MPI programs run on cheap networks such as 0.1 Gbit/s or 1 Gbit/s Ethernet with a cheap (slow, low-bisection) network switch. There is a very fast compression algorithm, Snappy (Wikipedia), with a compression speed of 250 MB/s and a decompression speed of 500 MB/s, so on compressible data and a slow network it should give some speedup. Is there any MPI library that can compress MPI messages (at the MPI layer, not compression of IP packets as in PPP)? MPI messages are also structured, so special methods are possible, such as compressing the exponent part in an array of doubles.
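
The idea of compressing a message buffer before handing it to the transport can be sketched at application level. This is a minimal sketch, not a feature of any MPI library: it uses `zlib` from the Python standard library as a stand-in for Snappy, contains no MPI calls, and the function names `pack_message` and `unpack_message` are hypothetical.

```python
import struct
import zlib


def pack_message(values, level=1):
    """Serialize doubles and compress them; level 1 favours speed over ratio."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(raw, level)


def unpack_message(blob):
    """Reverse pack_message: decompress, then decode 8-byte doubles."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


payload = [1.0] * 10_000      # deliberately low-entropy data
wire = pack_message(payload)  # bytes that would be sent instead of the raw buffer

print(len(payload) * 8, "raw bytes ->", len(wire), "compressed bytes")
assert unpack_message(wire) == payload
```

Whether this pays off depends on the ratio between compression throughput and network bandwidth, as the question notes. On a fast interconnect the CPU time spent compressing can easily outweigh the bytes saved.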


Good practices for multiple language data in Core Data

- by Luca Bartoletti

Hi, I need a multilingual Core Data database in my iPhone app. I could create a different database for each language, but I hope the iPhone SDK offers an automatic way to manage Core Data content in different languages, as it does for resources and strings. Does anyone have any hints?


core-data relationships and data structure.

- by Boaz

What is the right way to build an iPhone Core Data model for this SMS-like app (with location)? I want to represent a conversation entity with "profile1" and "profile2", which inherit from a profile entity, and a message entity with "to", "from", and "body", where "to" and "from" are equal to "profile1" and/or "profile2" in the conversation entity. How can I set up such relationships? Is there a better way to represent the data (another structure)? Thanks


convert unstructured data to structured data?

- by Codeguru

How can I convert unstructured data into structured data? For example, converting emails and contacts into a structured format. Are there any algorithms to do this?


When is the Data Vault model the right model for a data-warehouse?

- by Stephan Eggermont

I recently found a reference to 'Data Vault modeling' as a model for data warehouses. The models I've seen before are Inmon and Kimball. The author refers to possible performance problems due to the joins needed. It looks like a nice model, but I wonder about the gotchas. Are there any experience reports online?


Increase the compression performance of VPN

- by Martin

I am currently switching from a system with HPN-SSH tunnels and compression enabled to something VPN-based. I have tried tinc and n2n so far; Hamachi requires a library I do not have. In my primitive benchmarks I am not satisfied with the achievable bandwidth compared to the SSH tunnels. In tinc the low LZO setting performed best, but compression is only available in UDP mode. Ideally I would like a TCP-based VPN with multi-threaded compression. Can you suggest some ideas for increasing the performance? Would it be possible to somehow put a compression filter in front of the tun interface? Or are there any VPN implementations that might be better suited to my needs (fast compression, TCP-based, switch mode, does not have to be super-secure)? I would consider tunnelling Ethernet over SSH, but according to some articles it is not advisable.


[Visual C++]Forcing memory alignment of variables/data-structures

- by John

I'm looking at using SSE, and I gather that aligning data on 16-byte boundaries is recommended. There are two cases to consider:

    float data[4];
    struct myystruct { float x,y,z,w; };

I'm not sure the first case can be done explicitly, though there's perhaps a compiler option I could use? In the second case, I remember being able to control packing in old versions of GCC several years back. Is this still possible?


SQL SERVER – Data Pages in Buffer Pool – Data Stored in Memory Cache

- by pinaldave

Have you ever wondered what types of data are in your cache? During SQL Server trainings, I am often asked whether there is a way to know how much of a table's data is stored in the memory cache. The more detailed question I usually get is this: if there are multiple indexes on a table (and they are used in a query), is the data of that single table stored in the memory cache multiple times or only once? Here is a query you can run to figure out what kind of data is stored in the cache.

    USE AdventureWorks
    GO
    SELECT COUNT(*) AS cached_pages_count,
        name AS BaseTableName, IndexName, IndexTypeDesc
    FROM sys.dm_os_buffer_descriptors AS bd
    INNER JOIN (
        SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID,
            i.name IndexName, i.type_desc IndexTypeDesc
        FROM (
            SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID
            FROM sys.allocation_units AS au
            INNER JOIN sys.partitions AS p
                ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3)
            UNION ALL
            SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID
            FROM sys.allocation_units AS au
            INNER JOIN sys.partitions AS p
                ON au.container_id = p.partition_id AND au.type = 2
        ) AS s_obj
        LEFT JOIN sys.indexes i
            ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID
    ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id
    WHERE database_id = DB_ID()
    GROUP BY name, index_id, IndexName, IndexTypeDesc
    ORDER BY cached_pages_count DESC;
    GO

Now let us run the query above and observe its output. The result has four columns. Cached_Pages_Count lists the number of pages cached in memory. BaseTableName lists the base table from which data pages are cached. IndexName lists the name of the index from which pages are cached. IndexTypeDesc lists the type of index.

Now, let us do one more experiment. Please note that you should not run this test on a production server, as it can severely reduce the performance of the database.

    DBCC DROPCLEANBUFFERS

This will drop all the clean buffers so that we can start again from there. Now run the following script and check its execution plan.

    USE AdventureWorks
    GO
    SELECT UnitPrice, ModifiedDate
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderDetailID BETWEEN 1 AND 100
    GO

The execution plan shows the use of two different indexes. Now, let us run the script that checks the pages cached in SQL Server. It will give us the following output. It is clear from the result set that when more than one index is used, the data pages related to both (or all) of the indexes are stored in the memory cache separately.

Let me know what you think of this article. I took great pleasure in writing it because this is the subject I like the most. In the next article, we will see exactly which data are cached and which are not, using a few undocumented commands.

Reference: Pinal Dave (http://blog.SQLAuthority.com)


Bitmask data insertions in SSDT Post-Deployment scripts

- by jamiet

On my current project we are using SQL Server Data Tools (SSDT) to manage our database schema, and one of the tasks we need to do often is insert data into that schema once deployed. The typical method for doing this is to leverage Post-Deployment scripts, and that is exactly what we are doing. Our requirement is a little different, though: our data is split up into various buckets that we need to deploy selectively on a case-by-case basis. I was going to use a SQLCMD variable for each bucket (defaulted to some value other than “Yes”) to define whether it should be deployed or not, so we could use something like this in our Post-Deployment script:

    IF ($(DeployBucket1Flag) = 'Yes')
    BEGIN
        :r .\Bucket1.data.sql
    END
    IF ($(DeployBucket2Flag) = 'Yes')
    BEGIN
        :r .\Bucket2.data.sql
    END
    IF ($(DeployBucket3Flag) = 'Yes')
    BEGIN
        :r .\Bucket3.data.sql
    END

That works fine and is, I’m sure, a very common technique for doing this. It is, however, slightly ugly because we have to litter our deployment with various SQLCMD variables. My colleague James Rowland-Jones (whom I’m sure many of you know) suggested another technique: bitmasks. I won’t go into detail about how this works (James has already done that at Using a Bitmask - a practical example), but I’ll summarise by saying that you can deploy different combinations of the buckets simply by supplying a different numerical value for a single SQLCMD variable. Each bit of that value’s binary representation signifies whether a particular bucket should be deployed or not. This is better demonstrated using the following simple script (which can be easily leveraged inside your Post-Deployment scripts):

    /*
    $(DeployData) is a SQLCMD variable that would, if you were using this in SSDT,
    be declared in the SQLCMD variables section of your project file.
    It should contain a numerical value, defaulted to 0.
    In this example I have declared it using a :setvar statement.
    Test the effect of different values by changing the :setvar statement accordingly.
    Examples:
    :setvar DeployData 1 will deploy bucket 1
    :setvar DeployData 2 will deploy bucket 2
    :setvar DeployData 3 will deploy buckets 1 & 2
    :setvar DeployData 6 will deploy buckets 2 & 3
    :setvar DeployData 31 will deploy buckets 1, 2, 3, 4 & 5
    */
    :setvar DeployData 0
    DECLARE @bitmask VARBINARY(MAX) = CONVERT(VARBINARY,$(DeployData));
    IF (@bitmask & 1 = 1)
    BEGIN
        PRINT 'Bucket 1 insertions';
    END
    IF (@bitmask & 2 = 2)
    BEGIN
        PRINT 'Bucket 2 insertions';
    END
    IF (@bitmask & 4 = 4)
    BEGIN
        PRINT 'Bucket 3 insertions';
    END
    IF (@bitmask & 8 = 8)
    BEGIN
        PRINT 'Bucket 4 insertions';
    END
    IF (@bitmask & 16 = 16)
    BEGIN
        PRINT 'Bucket 5 insertions';
    END

An example of running this using DeployData=6: the binary representation of 6 is 110. The second and third significant bits of that binary number are set to 1, and hence buckets 2 and 3 are “activated”.

Hope that makes sense and is useful to some of you! @Jamiet

P.S. I used the awesome HTML Copy feature of Visual Studio’s Productivity Power Tools in order to format the T-SQL code above for this blog post.
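
The mapping from a single numeric value to a set of buckets can be checked outside SQL Server with a few lines of Python. This is only an illustrative sketch of the bit arithmetic; the function name `buckets_to_deploy` and its `bucket_count` parameter are hypothetical and not part of SSDT or SQLCMD.

```python
def buckets_to_deploy(deploy_data, bucket_count=5):
    """Return the 1-based bucket numbers whose bit is set in deploy_data."""
    # Bucket n corresponds to bit value 2**(n - 1): 1, 2, 4, 8, 16, ...
    return [n for n in range(1, bucket_count + 1)
            if deploy_data & (1 << (n - 1))]


for value in (0, 1, 2, 3, 6, 31):
    print(value, format(value, "05b"), buckets_to_deploy(value))
```

For the value 6 this yields buckets 2 and 3, matching the `:setvar DeployData 6` example in the script's comment block.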


Looking for Cutting-Edge Data Integration: 2014 Excellence Awards

- by Sandrine Riley

It is nomination time!!! This year's Oracle Fusion Middleware Excellence Awards will honor customers and partners who are creatively using various products across Oracle Fusion Middleware. Think you have something unique and innovative with one or a few of our Oracle Data Integration products? We would love to hear from you! Please submit today. The deadline for the nomination is June 20, 2014.

What you win:
An Oracle Fusion Middleware Innovation trophy
One free pass to Oracle OpenWorld 2014
Priority consideration for placement in Profit magazine, Oracle Magazine, or other Oracle publications & press release
Oracle Fusion Middleware Innovation logo for inclusion on your own Website and/or press release

Let us reminisce a little… For details on the 2013 Data Integration Winners, Royal Bank of Scotland’s Market and International Banking and The Yalumba Wine Company, check out this blog post: 2013 Oracle Excellence Awards for Fusion Middleware Innovation… and the Winners for Data Integration are… For details on the 2012 Data Integration Winners, Raymond James and Morrisons, check out this blog post: And the Winners of Fusion Middleware Innovation Awards in Data Integration are… Now to view the 2013 Winners (for all categories).

We hope to honor you! Here's what you need to do: Click here to submit your nomination today. And just a reminder: the deadline to submit a nomination is 5pm Pacific Time on June 20, 2014.


Help with Perl persistent data storage using Data::Dumper

- by stephenmm

I have been trying to figure this out for way too long tonight. I have googled it to death, and none of the examples or my hacks of the examples are getting it done. It seems like this should be pretty easy, but I just cannot get it. Here is the code:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my $complex_variable = {};
    my $MEMORY = "$ENV{HOME}/data/memory-file";

    $complex_variable->{ 'key' } = 'value';
    $complex_variable->{ 'key1' } = 'value1';
    $complex_variable->{ 'key2' } = 'value2';
    $complex_variable->{ 'key3' } = 'value3';
    print Dumper($complex_variable)."TEST001\n";

    open M, ">$MEMORY" or die;
    print M Data::Dumper->Dump([$complex_variable], ['$complex_variable']);
    close M;

    $complex_variable = {};
    print Dumper($complex_variable)."TEST002\n";

    # Then later to restore the value, it's simply:
    do $MEMORY;
    #eval $MEMORY;
    print Dumper($complex_variable)."TEST003\n";

And here is my output:

    $VAR1 = { 'key2' => 'value2', 'key1' => 'value1', 'key3' => 'value3', 'key' => 'value' };
    TEST001
    $VAR1 = {};
    TEST002
    $VAR1 = {};
    TEST003

Everything that I read says that the TEST003 output should look identical to the TEST001 output, which is exactly what I am trying to achieve. What am I missing here? Should I be "do"ing differently, or should I be "eval"ing instead, and if so, how? Thanks for any help...


Relational database data explorer / visualization?

- by Ian Boyd

Is there a tool that lets one browse relational data as a graph of connected nodes? For example, I'm faced with trying to cleanse some anomalous data. I can start with two offending rows. In this particular example, the TransactionID should, by business rules, be unique to the table, but I find a transaction that violates that rule:

    SELECT * FROM LCTTrans WHERE TransactionID = 1075048

    LCTID     TransactionID
    ========= =============
    4358      1075048
    4359      1075048

    2 row(s) affected

But really what I want is to begin to hunt down all the related data, to try to see which is right. So this hypothetical software would start by showing me these two rows:

Next, I want to see the transaction that is linked into this table:

Now that transaction points to an MAL, so show me that:

Now let's add those two LCTs that the transaction is "on". A transaction can be on only one LCT, yet this one is pointing to two:

Okay computer, both of those LCTs point to an MAL and the transaction that created them, show me those:

Those last two transactions also point at an MAL, and they themselves point to an LCT, show me those:

Okay, now are there any entries in LCTTrans that point to LCTs 4358 or 4359?... And so on, and so on.

Now, I did all this manually, running single selects, copying and pasting uniqueidentifier keys, and converting them into friendly id numbers so I could easily see the relationships. Is there software that can do this?


How to properly set relationships in Core Data when using setValue and data already exists

- by ern

Let's say I have two objects: Articles and Categories. For the sake of this example, all relevant categories have already been added to the data store. When looping through data that holds edits for articles, there is category relationship information that needs to be saved. I was planning on using the -setValue method in the Article class in order to set the relationships like so:

    - (void)setValue:(id)value forUndefinedKey:(NSString *)key {
        if([key isEqualToString:@"categories"]){
            NSLog(@"trying to set categories...");
        }
    }

The problem is that value isn't a Category; it is just a string (or array of strings) holding the title of a category. I could certainly do a lookup within this method for each category and assign it, but that seems inefficient when processing a whole bunch of articles at once. Another option is to populate an array of all possible categories and just filter, but my question is where to store that array. Should it be a class method on Article? Is there a way to pass additional data to the -setValue method? Is there another, better option for setting the relationship that I'm not thinking of? Thanks for your help.


Is there a quality, file-size, or other benefit to JPEG sizes being multiples of 8px or 16px?

- by davebug

The JPEG encoding process splits a given image into blocks of 8x8 pixels and works with these blocks in its subsequent lossy and lossless compression steps. [source] It is also mentioned that if the image dimensions are a multiple of one MCU block (Minimum Coded Unit, 'usually 16 pixels in both directions'), lossless alterations to a JPEG can be performed. [source] I am working with product images and would like to know both whether, and how much, benefit can be derived from using multiples of 16 in my final image size (say, an image of 480px by 360px) versus a non-multiple of 16 (such as 484x362). In this example I am not interested in further alterations, editing, or recompression of the final image.

To try to get closer to a specific answer where I know there must be largely generalities: given a 480x360 image that is 64k and saved at maximum quality in Photoshop [example]:

Can I expect any quality loss from an image that is 484x362?
What increase in file size can I expect (for this example, the additional space would be white pixels)?
Are there any other disadvantages to growing larger than the 8px grid?

I know it's arbitrary to use that specific example, but it would still be helpful (for me and potentially any others pondering an image size) to understand what level of compromise I'd be dealing with in breaking the 8px grid. The key issue here is a debate I've had about whether images whose dimensions are divisible by 8 pixels are higher quality than images whose dimensions are not.
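
The block arithmetic behind the question can be worked through directly. The sketch below only computes how far each dimension must be rounded up to fill whole blocks; it says nothing about visual quality or the resulting file size, which depend on the encoder and the image content. The function names are hypothetical.

```python
def padded_size(width, height, block=8):
    """Round both dimensions up to the next multiple of the block size."""
    def pad(n):
        return -(-n // block) * block  # ceiling division, then scale back
    return pad(width), pad(height)


def padding_overhead(width, height, block=8):
    """Fraction of extra pixels the encoder has to code beyond the image."""
    pw, ph = padded_size(width, height, block)
    return (pw * ph) / (width * height) - 1.0


for size in ((480, 360), (484, 362)):
    for block in (8, 16):
        print(size, "block", block, "->", padded_size(*size, block),
              f"{padding_overhead(*size, block):.2%} extra")
```

One detail this makes visible: 360 is a multiple of 8 but not of 16, so the 480x360 example fills whole 8x8 blocks exactly, yet would still need padding to 480x368 if the encoder's MCU is 16 pixels tall.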


Creating transparent PNG with exact RGBA values

- by rrowland

I'm color-coding a transparent image to be read programmatically. However, the image seems to be getting compressed, and my code is reading color values other than the ones I mean to pass it.

Concept

This is the output I get when exporting as PNG-24. I programmatically check each pixel for one of the six colors I use in creating the image: 0x00000F, 0x0000F0, 0x000F00, 0x00F000, 0x0F0000, 0xF00000. Each color represents a different texture to apply. Top right (0x00000F) will pull texture from the tile to its top right and blend it at a ratio equal to the opacity of the pixel. The end goal is to create a hex-tiled grid with differing textures that blend smoothly.

What's happening

It seems that when converting to PNG, Photoshop changes the RGBA values to make the image smoother, or just to help compression size. Parts that should be 250 red range anywhere from 150 to 255.

Question

Whether using PNG or another web-compatible format, I need to be able to save these pixel values, which are essentially instructions, losslessly and still maintain transparency. Is this possible in any format, or will I need to rethink my approach?
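
The PNG format itself stores 8-bit RGBA samples losslessly, so one way to rule out the exporting tool is to write the file programmatically. The sketch below is a minimal encoder built only on the Python standard library, with a matching decoder that reads back nothing but its own output (single IDAT chunk, filter type 0). It is an illustration under those assumptions, not a general PNG library, and it does not explain what Photoshop does on export. The function names are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind, data):
    # PNG chunk layout: length, type, data, CRC-32 over type + data
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_rgba_png(width, height, pixels):
    """pixels: list of rows, each a list of (r, g, b, a) tuples, 0-255 each."""
    raw = bytearray()
    for row in pixels:
        raw.append(0)  # filter type 0: scanline bytes stored unmodified
        for rgba in row:
            raw.extend(rgba)
    # Bit depth 8, colour type 6 (RGBA), default compression/filter, no interlace
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", zlib.compress(bytes(raw)))
            + _chunk(b"IEND", b""))


def decode_rgba_png(blob):
    """Decode only files produced by encode_rgba_png above."""
    pos, width, height, data = len(PNG_SIGNATURE), 0, 0, b""
    while pos < len(blob):
        length, kind = struct.unpack(">I4s", blob[pos:pos + 8])
        body = blob[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width, height = struct.unpack(">II", body[:8])
        elif kind == b"IDAT":
            data += body
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    raw = zlib.decompress(data)
    stride = 1 + width * 4  # one filter byte plus 4 bytes per pixel
    rows = []
    for y in range(height):
        line = raw[y * stride + 1:(y + 1) * stride]
        rows.append([tuple(line[i:i + 4]) for i in range(0, len(line), 4)])
    return rows


# Instruction colours from the question with arbitrary alpha values
pixels = [[(0x00, 0x00, 0x0F, 128), (0xF0, 0x00, 0x00, 3)],
          [(0x00, 0x0F, 0x00, 0), (0x0F, 0x00, 0x00, 255)]]
png_bytes = encode_rgba_png(2, 2, pixels)
assert decode_rgba_png(png_bytes) == pixels
```

Because every channel, including the colour of fully transparent pixels, survives the round trip here, altered values in an exported file would point at the export step rather than at PNG as a format.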

Search Results

Search found 59256 results on 2371 pages for 'data compression'.
