sample data - Page 150 - Developer IT

Improving performance of fuzzy string matching against a dictionary [closed]

- by Nathan Harmston

Hi, So I'm currently working for with using SecondString for fuzzy string matching, where I have a large dictionary to compare to (with each entry in the dictionary has an associated non-unique identifier). I am currently using a hashMap to store this dictionary. When I want to do fuzzy string matching, I first check to see if the string is in the hashMap and then I iterate through all of the other potential keys, calculating the string similarity and storing the k,v pair/s with the highest similarity. Depending on which dictionary I am using this can take a long time ( 12330 - 1800035 entries ). Is there any way to speed this up or make it faster? I am currently writing a memoization function/table as a way of speeding this up, but can anyone else think of a better way to improve the speed of this? Maybe a different structure or something else I'm missing. Many thanks in advance, Nathan

Read the article

Restore data from overwritten LVM

- by Matthias Bayer

I lost all of my data (8 TB) which I collected over the past few years yesterday because I made some seriuos mistakes during the remounting of my LVM. I run a XenServer5.6 installation with additional 4 harddisks for data storage. An LVM over those 4 HDDs was used to store all of my data. Yesterday, I reinstalled XenServer and wanted to mount my old Harddrives and add the LVM. I run xe sr-create [...] for all disks (/dev/sdb .. /dev/sde), but that was totally wrong. This command deletes the old LVM on the disks and created an new, empty lvm on every single disk with no partitions. No i got 4 empty harddrives :( Is it possible to recover some data from that lost LVM volumes? I have no clue how to do it because i deleted all informations about the old LVM. Is there a way to access the files insed that old lvm directly?

Read the article

12 és fél éve történt: Data Mart Suite és Discoverer/2000

- by Fekete Zoltán

Néhány hónapon belül az Oracle Hungary Kft. új irodaházba költözik. Érdemes tehát "inkrementálisan" selejtezni, ahogyan egy jó adattárházba is lépésenként kerülnek be az adatok, és témakörönként kisebb kilométerkövek mentén no a lefedett területek garmadája. :) Az imént akadt a kezembe egy jelentkezési lap az Oracle döntéstámogatás (DSS) témakörbol 1997-bol: Új döntésté(!)mogató eszközök a fejlesztok kezében, Oracle Partneri konferencia, 1997. november 7. :) Oldtimer... És mindez véletlenöl pontosan a NOSZF dátumára idozítve. Együtt ünnepelt a világ! Emlékszik még valaki, mi is a NOSZF feloldása? :) Azóta az Oracle Warehouse Builder és az Oracle Business Intelligence Enterprise Edition és a BI Standard Edition One lettek a zászlóshajók az ETL-ELT és az elemzés-kimutatáskészítés területen.

Read the article

Survey: How much data do you work with?

- by James Luetkehoelter

Andy isn't the only one that can ask a survey question. This is something I really curious about because many of the answers or recommendations or rants in blogs are not universably applicable to every database - small databases must sometimes be treated differently, and uber databases are just a pain (and fun at the same time). So, how would you classify most of the databases you work with: 1) Up to 50GB 2) 50-500GB 3) 500GB - 2TB 4) DEAR GOD THAT"S TOO MUCH INFORMATION! Share this post: email it!...(read more)

Read the article

silent failure while creating odbc data source

- by Peter

I just got really confused trying to create an ODBC data source in Windows 2003 R2. I can create a connection to my chosen server (a MS SQL Server) on the "user DSN" tab, but when I try to do the same thing on the "system DSN tab", the process fails but without an error message. I am able to connect to the target database fine at the end of configuring a new data source, but when I click OK, the data source just isn't there. No error message, no sign that anything went amiss, other than the lack of a new data source. Very annoying, as I had to repeat the process a few times to make sure I wasn't crazy. Anybody got any hints? I suspect it is a permission problem of some sort but since there is no error message, I don't know where to start.

Read the article

Graph Isomorphism > What kind of Graph is this?

- by oodavid

Essentially, this is a variation of Comparing Two Tree Structures, however I do not have "trees", but rather another type of graph. I need to know what kind of Graph I have in order to figure out if there's a Graph Isomorphism Special Case... As you can see, they are: Not Directed Not A Tree Cyclic Max 4 connections But I still don't know the correct terminology, nor the which Isomorphism algorithm to pursue, guidance appreciated.

Read the article

What is Granularity?

- by tonyrogerson

Granularity defines “the lowest level of detail”; but what is meant by “the lowest level of detail”? Consider the Transactions table below: create table Transactions ( TransactionID int not null primary key clustered, TransactionDate date not null, ClientID int not null, StockID int not null, TransactionAmount decimal ( 28 , 2 ) not null, CommissionAmount decimal ( 28 , 5 ) not null ) A Client can Trade in one or many Stocks on any date – there is no uniqueness to ClientID, Stock and TransactionDate...(read more)

Read the article

Hiding a file or data from being accessed unless on scheduled days [closed]

- by gkt.pro

Possible Duplicate: restricting access to volumes disk even for admin account windows How to restrict use of a computer? I want to limit my access to some data and what I want is that I should be able to access the data only on certain days of the month (e.g., every 3rd day). Is there any way like encryption or some utility to allow me to only access data on specific days? One idea that I was thinking of was to encrypt the data and store the password (will be complex and long so that I couldn't remember it right away) on some website which would then email me back the password in future on those specific days.

Read the article

T-SQL Tuesday #005 : SSRS Parameters and MDX Data Sets

- by blakmk

Well it this weeks T-SQL Tuesday #005 topic seems quite fitting. Having spent the past few weeks creating reports and dashboards in SSRS and SSAS 2008, I was frustrated by how difficult it is to use custom datasets to generate parameter drill downs. It also seems Reporting Services can be quite unforgiving when it comes to renaming things like datasets, so I want to share a couple of techniques that I found useful. One of the things I regularly do is to add parameters to the querys. However doing this causes Reporting Services to generate a hidden dataset and parameter name for you. One of the things I like to do is tweak these hidden datasets removing the ‘ALL’ level which is a tip I picked up from Devin Knight in his blog: There are some rules i’ve developed for myself since working with SSRS and MDX, they may not be the best or only way but they work for me. Rule 1 – Never trust the automatically generated hidden datasets Or even ANY, automatically generated MDX queries for that matter.... I’ve previously blogged about this here. If you examine the MDX generated in the hidden dataset you will see that it generates the MDX in the context of the originiating query by building a subcube, this mean it may NOT be appropriate to use this in a subsequent query which has a different context. Make sure you always understand what is going on. Often when i’m developing a dashboard or a report there are several parameter oriented datasets that I like to manually create. It can be that I have different datasets using the same dimension but in a different context. One example of this, is that I often use a dataset for last month and a dataset for the last 6 months. Both use the same date hierarchy. However Reporting Services seems not to be too smart when it comes to generating unique datasets when working with and renaming parameters and datasets. Very often I have come across this error when it comes to refactoring parameter names and default datasets. "an item with the same key has already been added" The only way I’ve found to reliably avoid this is to obey to rule 2. Rule 2 – Follow this sequence when it comes to working with Parameters and DataSets: 1. Create Lookup and Default Datasets in advance 2. Create parameters (set the datasets for available and default values) 3. Go into query and tick parameter check box 4. On dataset properties screen, select the parameter defined earlier from the parameter value defined earlier. Rule 3 – Dont tear your hair out when you have just renamed objects and your report doesn’t build Just use XML notepad on the original report file. I found I gained a good understanding of the structure of the underlying XML document just by using XML notepad. From this you can do a search and find references of the missing object. You can also just do a wholesale search and replace (after taking a backup copy of course ;-) So I hope the above help to save the sanity of anyone who regularly works with SSRS and MDX. @Blakmk

Read the article

Student Loan Guarantor Hit With Data Breach

After theft of a portable media device, federal student loan group Educational Credit Management Corp. notifies 3.3 million borrowers that their information is at risk.

Read the article

Google Fonts API JSON Data in WordPress Options-Framework-Theme

- by Rob

I'm developing a child-theme off of the new Twenty Twelve theme using Wordpress 3.4.2 and the development version of the Options Theme Framework by Devin Price. In Devin's tutorial, it shows of a way to implement 15 Google Web Fonts into the Theme Options page, but not all of them (roughly 560). I know I can create a "manual list", like in the tutorial that states each one with fallbacks, but this is time consuming and unproductive as Google may or may not add to, update, change or remove some of these fonts from their list. The list I've created above will ultimately store unavailable fonts the user thinks is there because of what they can see in the drop-down menu and it won't have any new ones - making the list and some selections obsolete. On the Google Developer API Web Fonts page, it talks briefly on retrieving a "dynamic list" using JSON/JavaScript. I was wondering how would I be able to pull the Google Web Fonts API into my Wordpress Theme Options page so I'm not creating my own list or have to constantly release an update to solve this issue. Could someone please walk me through what I would need to paste into my options.php, functions.php, /inc/options-framework.php file etc. or even in a new one to implement this? I've also had a look into some screencasts, plugins and tutorials on how it works, but none of them are specific enough for people just starting out. Please keep in mind I'm not the best coder... Thank you.

Read the article

A design pattern for data binding an object (with subclasses) to asp.net user control

- by Rohith Nair

I have an abstract class called Address and I am deriving three classes ; HomeAddress, Work Address, NextOfKin address. My idea is to bind this to a usercontrol and based on the type of Address it should bind properly to the ASP.NET user control. My idea is the user control doesn't know which address it is going to present and based on the type it will parse accordingly. How can I design such a setup, based on the fact that, the user control can take any type of address and bind accordingly. I know of one method like :- Declare class objects for all the three types (Home,Work,NextOfKin). Declare an enum to hold these types and based on the type of this enum passed to user control, instantiate the appropriate object based on setter injection. As a part of my generic design, I just created a class structure like this :- I know I am missing a lot of pieces in design. Can anybody give me an idea of how to approach this in proper way.

Read the article

Deploying SSIS to Integration Services Catalog (SSISDB) via SQL Server Data Tools

- by Kevin Shyr

There are quite a few good articles/blogs on this. For a straight forward deployment, read this (http://www.bibits.co/post/2012/08/23/SSIS-SQL-Server-2012-Project-Deployment.aspx). For a more dynamic and comprehensive understanding about all the different settings, read part 1 (http://www.mssqltips.com/sqlservertip/2450/ssis-package-deployment-model-in-sql-server-2012-part-1-of-2/) and part 2 (http://www.mssqltips.com/sqlservertip/2451/ssis-package-deployment-model-in-sql-server-2012-part-2-of-2/) Microsoft official doc: http://technet.microsoft.com/en-us/library/hh213373 This only thing I would add is the following. After your first deployment, you'll notice that the subsequent deployment skips the second step (go directly "Select Destination" and skipped "Select Source"). That's because after your initial deployment, a ispac file is created to track deployment. If you decide to go back to "Select Source" and select SSIS catalog again, the deployment process will complete, but the packages will not be deployed.

Read the article

xml file save/read error (making a highscore system for XNA game)

- by Eddy

i get an error after i write player name to the file for second or third time (An unhandled exception of type 'System.InvalidOperationException' occurred in System.Xml.dll Additional information: There is an error in XML document (18, 17).) (in highscores load method In data = (HighScoreData)serializer.Deserialize(stream); it stops) the problem is that some how it adds additional "" at the end of my .dat file could anyone tell me how to fix this? the file before save looks: <?xml version="1.0"?> <HighScoreData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <PlayerName> <string>neil</string> <string>shawn</string> <string>mark</string> <string>cindy</string> <string>sam</string> </PlayerName> <Score> <int>200</int> <int>180</int> <int>150</int> <int>100</int> <int>50</int> </Score> <Count>5</Count> </HighScoreData> the file after save looks: <?xml version="1.0"?> <HighScoreData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <PlayerName> <string>Nick</string> <string>Nick</string> <string>neil</string> <string>shawn</string> <string>mark</string> </PlayerName> <Score> <int>210</int> <int>210</int> <int>200</int> <int>180</int> <int>150</int> </Score> <Count>5</Count> </HighScoreData>> the part of my code that does all of save load to xml is: DECLARATIONS PART [Serializable] public struct HighScoreData { public string[] PlayerName; public int[] Score; public int Count; public HighScoreData(int count) { PlayerName = new string[count]; Score = new int[count]; Count = count; } } IAsyncResult result = null; bool inputName; HighScoreData data; int Score = 0; public string NAME; public string HighScoresFilename = "highscores.dat"; Game1 constructor public Game1() { graphics = new GraphicsDeviceManager(this); Content.RootDirectory = "Content"; Width = graphics.PreferredBackBufferWidth = 960; Height = graphics.PreferredBackBufferHeight =640; GamerServicesComponent GSC = new GamerServicesComponent(this); Components.Add(GSC); } Inicialize function (end of it) protected override void Initialize() { //other game code base.Initialize(); string fullpath =Path.Combine(HighScoresFilename); if (!File.Exists(fullpath)) { //If the file doesn't exist, make a fake one... // Create the data to save data = new HighScoreData(5); data.PlayerName[0] = "neil"; data.Score[0] = 200; data.PlayerName[1] = "shawn"; data.Score[1] = 180; data.PlayerName[2] = "mark"; data.Score[2] = 150; data.PlayerName[3] = "cindy"; data.Score[3] = 100; data.PlayerName[4] = "sam"; data.Score[4] = 50; SaveHighScores(data, HighScoresFilename); } } all methods for loading saving and output public static void SaveHighScores(HighScoreData data, string filename) { // Get the path of the save game string fullpath = Path.Combine("highscores.dat"); // Open the file, creating it if necessary FileStream stream = File.Open(fullpath, FileMode.OpenOrCreate); try { // Convert the object to XML data and put it in the stream XmlSerializer serializer = new XmlSerializer(typeof(HighScoreData)); serializer.Serialize(stream, data); } finally { // Close the file stream.Close(); } } /* Load highscores */ public static HighScoreData LoadHighScores(string filename) { HighScoreData data; // Get the path of the save game string fullpath = Path.Combine("highscores.dat"); // Open the file FileStream stream = File.Open(fullpath, FileMode.OpenOrCreate, FileAccess.Read); try { // Read the data from the file XmlSerializer serializer = new XmlSerializer(typeof(HighScoreData)); data = (HighScoreData)serializer.Deserialize(stream);//this is the line // where program gives an error } finally { // Close the file stream.Close(); } return (data); } /* Save player highscore when game ends */ private void SaveHighScore() { // Create the data to saved HighScoreData data = LoadHighScores(HighScoresFilename); int scoreIndex = -1; for (int i = 0; i < data.Count ; i++) { if (Score > data.Score[i]) { scoreIndex = i; break; } } if (scoreIndex > -1) { //New high score found ... do swaps for (int i = data.Count - 1; i > scoreIndex; i--) { data.PlayerName[i] = data.PlayerName[i - 1]; data.Score[i] = data.Score[i - 1]; } data.PlayerName[scoreIndex] = NAME; //Retrieve User Name Here data.Score[scoreIndex] = Score; // Retrieve score here SaveHighScores(data, HighScoresFilename); } } /* Iterate through data if highscore is called and make the string to be saved*/ public string makeHighScoreString() { // Create the data to save HighScoreData data2 = LoadHighScores(HighScoresFilename); // Create scoreBoardString string scoreBoardString = "Highscores:\n\n"; for (int i = 0; i<5;i++) { scoreBoardString = scoreBoardString + data2.PlayerName[i] + "-" + data2.Score[i] + "\n"; } return scoreBoardString; } when ill make this work i will start this code when i call game over (now i start it when i press some buttons, so i could test it faster) public void InputYourName() { if (result == null && !Guide.IsVisible) { string title = "Name"; string description = "Write your name in order to save your Score"; string defaultText = "Nick"; PlayerIndex playerIndex = new PlayerIndex(); result= Guide.BeginShowKeyboardInput(playerIndex, title, description, defaultText, null, null); // NAME = result.ToString(); } if (result != null && result.IsCompleted) { NAME = Guide.EndShowKeyboardInput(result); result = null; inputName = false; SaveHighScore(); } } this where i call output to the screen (ill call this in highscores meniu section when i am done with debugging) spriteBatch.DrawString(Font1, "" + makeHighScoreString(),new Vector2(500,200), Color.White); }

Read the article

SSRS Parameters and MDX Data Sets

- by blakmk

Having spent the past few weeks creating reports and dashboards in SSRS and SSAS 2008, I was frustrated by how difficult it is to use custom datasets to generate parameter drill downs. It also seems Reporting Services can be quite unforgiving when it comes to renaming things like datasets, so I want to share a couple of techniques that I found useful. One of the things I regularly do is to add parameters to the querys. However doing this causes Reporting Services to generate a hidden dataset and parameter name for you. One of the things I like to do is tweak these hidden datasets removing the ‘ALL’ level which is a tip I picked up from Devin Knight in his blog: There are some rules i’ve developed for myself since working with SSRS and MDX, they may not be the best or only way but they work for me. Rule 1 – Never trust the automatically generated hidden datasets Or even ANY, automatically generated MDX queries for that matter.... I’ve previously blogged about this here. If you examine the MDX generated in the hidden dataset you will see that it generates the MDX in the context of the originiating query by building a subcube, this mean it may NOT be appropriate to use this in a subsequent query which has a different context. Make sure you always understand what is going on. Often when i’m developing a dashboard or a report there are several parameter oriented datasets that I like to manually create. It can be that I have different datasets using the same dimension but in a different context. One example of this, is that I often use a dataset for last month and a dataset for the last 6 months. Both use the same date hierarchy. However Reporting Services seems not to be too smart when it comes to generating unique datasets when working with and renaming parameters and datasets. Very often I have come across this error when it comes to refactoring parameter names and default datasets. "an item with the same key has already been added" The only way I’ve found to reliably avoid this is to obey to rule 2. Rule 2 – Follow this sequence when it comes to working with Parameters and DataSets: 1. Create Lookup and Default Datasets in advance 2. Create parameters (set the datasets for available and default values) 3. Go into query and tick parameter check box 4. On dataset properties screen, select the parameter defined earlier from the parameter value defined earlier. Rule 3 – Dont tear your hair out when you have just renamed objects and your report doesn’t build Just use XML notepad on the original report file. I found I gained a good understanding of the structure of the underlying XML document just by using XML notepad. From this you can do a search and find references of the missing object. You can also just do a wholesale search and replace (after taking a backup copy of course ;-) So I hope the above help to save the sanity of anyone who regularly works with SSRS and MDX.

Read the article

Sharing and previewing Google Docs in Socialwok: Google Data APIs

Editor's note: Navin Kumar is CTO and co-founder of Socialwok, a feed-based group collaboration application for enterprises that integrates with Google Apps. With Socialwok, Google Apps users can...

Read the article

dnssec zonesigner ignoring out-of-zone data

- by jordi12100

I am trying to configure DNSSec with BIND9 on CentOS 6.4 running DirectAdmin control panel. I am using this tutorial to make it work: https://www.dnssec-tools.org/wiki/index.php/Zonesigner But I can't get it work... When I run this command: zonesigner --genkeys jordikroon.nl.db jordikroon.nl.db.signed I get this error: jordikroon.nl.db:17: ignoring out-of-zone data (jordikroon.nl) jordikroon.nl.db:18: ignoring out-of-zone data (jordikroon.nl) jordikroon.nl.db:22: ignoring out-of-zone data (jordikroon.nl) jordikroon.nl.db:29: ignoring out-of-zone data (jordikroon.nl) jordikroon.nl.db:33: ignoring out-of-zone data (jordikroon.nl) zone jordikroon.nl.db/IN: has no NS records zone jordikroon.nl.db/IN: not loaded due to errors. I can't find anything on the web about this error. This is my zone db file: $TTL 14400 @ IN SOA ns1.ghservers.org. hostmaster.jordikroon.nl. ( 2013090703 14400 3600 1209600 86400 ) jordikroon.nl. 14400 IN NS ns1.ghservers.org. jordikroon.nl. 14400 IN NS ns2.ghservers.org. cp 14400 IN A 85.17.32.228 ftp 14400 IN A 85.17.32.228 jordikroon.nl. 14400 IN A 85.17.32.228 localhost 14400 IN A 127.0.0.1 mail 14400 IN A 85.17.32.228 pop 14400 IN A 85.17.32.228 smtp 14400 IN A 85.17.32.228 www 14400 IN A 85.17.32.228 jordikroon.nl. 14400 IN MX 10 mail jordikroon.nl. 14400 IN TXT "v=spf1 a mx ip4:85.17.32.228 ~all" localhost 14400 IN AAAA ::1 How do I have to fix this? All IN keywords are being ignored. Any help is welcome:-)

Read the article

Windows Azure Emulators On Your Desktop

- by BuckWoody

Many people feel they have to set up a full Azure subscription online to try out and develop on Windows Azure. But you don’t have to do that right away. In fact, you can download the Windows Azure Compute Emulator – a “cloud development environment” – right on your desktop. No, it’s not for production use, and no, you won’t have other people using your system as a cloud provider, and yes, there are some differences with Production Windows Azure, but you’ll be able code, run, test, diagnose, watch, change and configure code without having any connection to the Internet at all. The best thing about this approach is that when you are ready to deploy the code you’ve been testing, a few clicks deploys it to your subscription when you make one. So what deep-magic does it take to run such a thing right on your laptop or even a Virtual PC? Well, it’s actually not all that difficult. You simply download and install the Windows Azure SDK (you can even get a free version of Visual Studio for it to run on – you’re welcome) from here: http://msdn.microsoft.com/en-us/windowsazure/cc974146.aspx This SDK will also install the Windows Azure Compute Emulator and the Windows Azure Storage Emulator – and then you’re all set. Right-click the icon for Visual Studio and select “Run as Administrator”: Now open a new “Cloud” type of project: Add your Web and Worker Roles that you want to code: And when you’re done with your design, press F5 to start the desktop version of Azure: Want to learn more about what’s happening underneath? Right-click the tray icon with the Azure logo, and select the two emulators to see what they are doing: In the configuration files, you’ll see a “Use Development Storage” setting. You can call the BLOB, Table or Queue storage and it will all run on your desktop. When you’re ready to deploy everything to Windows Azure, you simply change the configuration settings and add the storage keys and so on that you need. Want to learn more about all this? Overview of the Windows Azure Compute Emulator: http://msdn.microsoft.com/en-us/library/gg432968.aspx Overview of the Windows Azure Storage Emulator: http://msdn.microsoft.com/en-us/library/gg432983.aspx January 2011 Training Kit: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=413E88F8-5966-4A83-B309-53B7B77EDF78&displaylang=en

Read the article

SQL SERVER – SSIS Look Up Component – Cache Mode – Notes from the Field #028

- by Pinal Dave

[Notes from Pinal]: Lots of people think that SSIS is all about arranging various operations together in one logical flow. Well, the understanding is absolutely correct, but the implementation of the same is not as easy as it seems. Similarly most of the people think lookup component is just component which does look up for additional information and does not pay much attention to it. Due to the same reason they do not pay attention to the same and eventually get very bad performance. Linchpin People are database coaches and wellness experts for a data driven world. In this 28th episode of the Notes from the Fields series database expert Tim Mitchell (partner at Linchpin People) shares very interesting conversation related to how to write a good lookup component with Cache Mode. In SQL Server Integration Services, the lookup component is one of the most frequently used tools for data validation and completion. The lookup component is provided as a means to virtually join one set of data to another to validate and/or retrieve missing values. Properly configured, it is reliable and reasonably fast. Among the many settings available on the lookup component, one of the most critical is the cache mode. This selection will determine whether and how the distinct lookup values are cached during package execution. It is critical to know how cache modes affect the result of the lookup and the performance of the package, as choosing the wrong setting can lead to poorly performing packages, and in some cases, incorrect results. Full Cache The full cache mode setting is the default cache mode selection in the SSIS lookup transformation. Like the name implies, full cache mode will cause the lookup transformation to retrieve and store in SSIS cache the entire set of data from the specified lookup location. As a result, the data flow in which the lookup transformation resides will not start processing any data buffers until all of the rows from the lookup query have been cached in SSIS. The most commonly used cache mode is the full cache setting, and for good reason. The full cache setting has the most practical applications, and should be considered the go-to cache setting when dealing with an untested set of data. With a moderately sized set of reference data, a lookup transformation using full cache mode usually performs well. Full cache mode does not require multiple round trips to the database, since the entire reference result set is cached prior to data flow execution. There are a few potential gotchas to be aware of when using full cache mode. First, you can see some performance issues – memory pressure in particular – when using full cache mode against large sets of reference data. If the table you use for the lookup is very large (either deep or wide, or perhaps both), there’s going to be a performance cost associated with retrieving and caching all of that data. Also, keep in mind that when doing a lookup on character data, full cache mode will always do a case-sensitive (and in some cases, space-sensitive) string comparison even if your database is set to a case-insensitive collation. This is because the in-memory lookup uses a .NET string comparison (which is case- and space-sensitive) as opposed to a database string comparison (which may be case sensitive, depending on collation). There’s a relatively easy workaround in which you can use the UPPER() or LOWER() function in the pipeline data and the reference data to ensure that case differences do not impact the success of your lookup operation. Again, neither of these present a reason to avoid full cache mode, but should be used to determine whether full cache mode should be used in a given situation. Full cache mode is ideally useful when one or all of the following conditions exist: The size of the reference data set is small to moderately sized The size of the pipeline data set (the data you are comparing to the lookup table) is large, is unknown at design time, or is unpredictable Each distinct key value(s) in the pipeline data set is expected to be found multiple times in that set of data Partial Cache When using the partial cache setting, lookup values will still be cached, but only as each distinct value is encountered in the data flow. Initially, each distinct value will be retrieved individually from the specified source, and then cached. To be clear, this is a row-by-row lookup for each distinct key value(s). This is a less frequently used cache setting because it addresses a narrower set of scenarios. Because each distinct key value(s) combination requires a relational round trip to the lookup source, performance can be an issue, especially with a large pipeline data set to be compared to the lookup data set. If you have, for example, a million records from your pipeline data source, you have the potential for doing a million lookup queries against your lookup data source (depending on the number of distinct values in the key column(s)). Therefore, one has to be keenly aware of the expected row count and value distribution of the pipeline data to safely use partial cache mode. Using partial cache mode is ideally suited for the conditions below: The size of the data in the pipeline (more specifically, the number of distinct key column) is relatively small The size of the lookup data is too large to effectively store in cache The lookup source is well indexed to allow for fast retrieval of row-by-row values No Cache As you might guess, selecting no cache mode will not add any values to the lookup cache in SSIS. As a result, every single row in the pipeline data set will require a query against the lookup source. Since no data is cached, it is possible to save a small amount of overhead in SSIS memory in cases where key values are not reused. In the real world, I don’t see a lot of use of the no cache setting, but I can imagine some edge cases where it might be useful. As such, it’s critical to know your data before choosing this option. Obviously, performance will be an issue with anything other than small sets of data, as the no cache setting requires row-by-row processing of all of the data in the pipeline. I would recommend considering the no cache mode only when all of the below conditions are true: The reference data set is too large to reasonably be loaded into SSIS memory The pipeline data set is small and is not expected to grow There are expected to be very few or no duplicates of the key values(s) in the pipeline data set (i.e., there would be no benefit from caching these values) Conclusion The cache mode, an often-overlooked setting on the SSIS lookup component, represents an important design decision in your SSIS data flow. Choosing the right lookup cache mode directly impacts the fidelity of your results and the performance of package execution. Know how this selection impacts your ETL loads, and you’ll end up with more reliable, faster packages. If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: Notes from the Field, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL Tagged: SSIS

Read the article

Highlight Word add-in for Visual Studio 2010 [SSDT]

- by jamiet

I’ve just been alerted by my colleague Kyle Harvie to a Visual Studio 2010 add-in that should prove very useful if you are an SSDT user. Its simply called Highlight all occurrences of selected word and does exactly what it says on the tin, you highlight a word and it shows all other occurrences of that word in your script: There’s a limitation for .sql files (which I have reported) where the highlighting doesn’t work if the word is wrapped in square brackets but what the heck, its free, it takes about ten seconds to install….install it already! @Jamiet

Read the article

WCF Data Service Pipeline

- by Daniel Cazzulino

For documentation purposes, I just draw the following UML sequence diagrams for the “Astoria” pipeline, using Visual Studio 2010 Ultimate: For a single-entity (or non-batched) request, this is the sequence: For a batch request, this is the sequence instead: DataService component is your own DataService<T>-derived class, and DataService.ProcessingPipeline refers to its ProcessingPipeline property pipeline events. /kzu

Read the article

Microsoft Entourage/Exchange Server problem: all objects disappeared from server - still in some for

- by splattne

One of our employees works with Entourage on his MacBook Pro (OSX 10.6) accessing Exchange Server 2007. Last Friday morning, I think while working over a VPN, Entourage (I think it was Entourage) deleted all his objects (mail, calendar, contacts) on the server and while creating a lot of strange folders (starting with underscores) on the client. The local data seems to be there, but not in a consistent form. Since the user's mailbox is rather big, I suspect, that there was some kind of "move" operation which did not complete. I tried to export the data, but the export stops because of a corrupted object. Is there a tool or another way to export or retrieve the local data? Edit - FYI: we solved the problem getting his data from the previous night's backup.

Read the article

TSQL Challenge 29 - Reorganize sales data by sales amount

Your job is to reorganize the items in the table by sales amount (quantity * price). Reorganize the items from left to right, bottom to top. The item with the highest sales amount should come on top left of each invoice.

Read the article

Proper Data Structure for Commentable Comments

- by Wesley

Been struggling with this on an architectural level. I have an object which can be commented on, let's call it a Post. Every post has a unique ID. Now I want to comment on that Post, and I can use ID as a foreign key, and each PostComment has an ItemID field which correlates to the Post. Since each Post has a unique ID, it is very easy to assign "Top Level" comments. When I comment on a comment however, I feel like I now need a PostCommentComment, which attaches to the ID of the PostComment. Since ID's are assigned sequentially, I can no longer simply use ItemID to differentiate where in the tree the comment is assigned. I.E. both a Post and a Post Comment might have an ID of '5', so my foreign key relationship is invalid. This seems like it could go on infinitely, with PostCommentCommentComment's etc... What's the best way to solve this? Should I have a field in the comment called "IsPostComment" or something of the like to know which collection to attach the ID to? This strikes me as the best solution I've seen so far, but now I feel like I need to make recursive DataBase calls which start to get expensive. Meaning, I get a Post and get all PostComments where ItemID == Post.ID && where IsPostComment == true Then I take that as a collection, gather all the ID's of the PostComments, and do another search where ItemID == PostComment[all].ID && where IsPostComment == false, then repeat infinitely. This means I make a call for every layer, and if I'm calling 100 Posts, I might make 1000 DB calls to get 10 layers of comments each. What is the right way to do this?

Read the article

In the Cloud, Everything Costs Money

- by BuckWoody

I’ve been teaching my daughter about budgeting. I’ve explained that most of the time the money coming in is from only one or two sources – and you can only change that from time to time. The money going out, however, is to many locations, and it changes all the time. She’s made a simple debits and credits spreadsheet, and I’m having her research each part of the budget. Her eyes grow wide when she finds out everything has a cost – the house, gas for the lawnmower, dishes, water for showers, food, electricity to run the fridge, a new fridge when that one breaks, everything has a cost. She asked me “how do you pay for all this?” It’s a sentiment many adults have looking at their own budgets – and one reason that some folks don’t even make a budget. It’s hard to face up to the realities of how much it costs to do what we want to do. When we design a computing solution, it’s interesting to set up a similar budget, because we don’t always consider all of the costs associated with it. I’ve seen design sessions where the new software or servers are considered, but the “sunk” costs of personnel, networking, maintenance, increased storage, new sizes for backups and offsite storage and so on are not added in. They are already on premises, so they are assumed to be paid for already. When you move to a distributed architecture, you'll see more costs directly reflected. Store something, pay for that storage. If the system is deployed and no one is using it, you’re still paying for it. As you watch those costs rise, you might be tempted to think that a distributed architecture costs more than an on-premises one. And you might be right – for some solutions. I’ve worked with a few clients where moving to a distributed architecture doesn’t make financial sense – so we didn’t implement it. I still designed the system in a distributed fashion, however, so that when it does make sense there isn’t much re-architecting to do. In other cases, however, if you consider all of the on-premises costs and compare those accurately to operating a system in the cloud, the distributed system is much cheaper. Again, I never recommend that you take a “here-or-there-only” mentality – I think a hybrid distributed system is usually best – but each solution is different. There simply is no “one size fits all” to architecting a solution. As you design your solution, cost out each element. You might find that using a hybrid approach saves you money in one design and not in another. It’s a brave new world indeed. So yes, in the cloud, everything costs money. But an on-premises solution also costs money – it’s just that “dad” (the company) is paying for it and we don’t always see it. When we go out on our own in the cloud, we need to ensure that we consider all of the costs.

Search Results

Search found 63736 results on 2550 pages for 'sample data'.

Page 150/2550 | < Previous Page | 146 147 148 149 150 151 152 153 154 155 156 157 | Next Page >

- by Nathan Harmston

- by Matthias Bayer

- by Fekete Zoltán

- by James Luetkehoelter

- by Peter

- by oodavid

- by tonyrogerson

- by gkt.pro

- by blakmk

- by Rob

- by Rohith Nair

- by Kevin Shyr

- by Eddy

- by blakmk

- by jordi12100

- by BuckWoody

- by Pinal Dave

- by jamiet

- by Daniel Cazzulino

- by splattne

- by Wesley

- by BuckWoody

< Previous Page | 146 147 148 149 150 151 152 153 154 155 156 157 | Next Page >