Search Results

Search found 30671 results on 1227 pages for 'database modeling'.

Page 97/1227 | < Previous Page | 93 94 95 96 97 98 99 100 101 102 103 104  | Next Page >

  • is Payment table needed when you have an invoice table like this?

    - by EBAGHAKI
    this is my invoice table: Invoice Table: invoice_id creation_date due_date payment_date status enum('not paid','paid','expired') user_id total_price I wonder if it's Useful to have a payment table in order to record user payments for invoices. payment table can be like this: payment_id payment_date invoice_id price_paid status enum('successful', 'not successful')

    Read the article

  • How do I sync a subset of tables between two databases on the same mysql database server

    - by Mike
    would like to be able to sync a subset of tables between two mysql databases that are running on the same server. One of the databases acts as the master where inserts, updates and deletes can be made. The second database uses those same tables for read-only operations. I do not want to use federated tables to achieve this. The long term goal will be to separate the 2 databases to multiple servers, The second database that has the subset of tables as read-only may also be replicated a few times over to distribute geographically for load and performance purposes each with unqiue data.... Once that is achieved, I plan to use binlog to replicate those specific tables on the secondary databases. In the meantime, I'd like to keep these tables in sync. Is there a more elegant way to do this than other than using a cronjob and mysqldump?

    Read the article

  • Speeding up a group by date query on a big table in postgres

    - by zaius
    I've got a table with around 20 million rows. For arguments sake, lets say there are two columns in the table - an id and a timestamp. I'm trying to get a count of the number of items per day. Here's what I have at the moment. SELECT DATE(timestamp) AS day, COUNT(*) FROM actions WHERE DATE(timestamp) >= '20100101' AND DATE(timestamp) < '20110101' GROUP BY day; Without any indices, this takes about a 30s to run on my machine. Here's the explain analyze output: GroupAggregate (cost=675462.78..676813.42 rows=46532 width=8) (actual time=24467.404..32417.643 rows=346 loops=1) -> Sort (cost=675462.78..675680.34 rows=87021 width=8) (actual time=24466.730..29071.438 rows=17321121 loops=1) Sort Key: (date("timestamp")) Sort Method: external merge Disk: 372496kB -> Seq Scan on actions (cost=0.00..667133.11 rows=87021 width=8) (actual time=1.981..12368.186 rows=17321121 loops=1) Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date)) Total runtime: 32447.762 ms Since I'm seeing a sequential scan, I tried to index on the date aggregate CREATE INDEX ON actions (DATE(timestamp)); Which cuts the speed by about 50%. HashAggregate (cost=796710.64..796716.19 rows=370 width=8) (actual time=17038.503..17038.590 rows=346 loops=1) -> Seq Scan on actions (cost=0.00..710202.27 rows=17301674 width=8) (actual time=1.745..12080.877 rows=17321121 loops=1) Filter: ((date("timestamp") >= '2010-01-01'::date) AND (date("timestamp") < '2011-01-01'::date)) Total runtime: 17038.663 ms I'm new to this whole query-optimization business, and I have no idea what to do next. Any clues how I could get this query running faster?

    Read the article

  • IntegrityError with Booleand Fields and Postgresql

    - by xRobot
    I have this simple Blog model: class Blog(models.Model): title = models.CharField(_('title'), max_length=60, blank=True, null=True) body = models.TextField(_('body')) user = models.ForeignKey(User) is_public = models.BooleanField(_('is public'), default = True) When I insert a blog in admin interface, I get this error: IntegrityError at /admin/blogs/blog/add/ null value in column "is_public" violates not-null constraint Why ???

    Read the article

  • What does ON [PRIMARY] mean?

    - by Icono123
    I'm creating an SQL setup script and I'm using someone else's script as an example. Here's an example of the script: SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO CREATE TABLE [dbo].[be_Categories]( [CategoryID] [uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT [DF_be_Categories_CategoryID] DEFAULT (newid()), [CategoryName] [nvarchar](50) NULL, [Description] [nvarchar](200) NULL, [ParentID] [uniqueidentifier] NULL, CONSTRAINT [PK_be_Categories] PRIMARY KEY CLUSTERED ( [CategoryID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY] GO Does anyone know what the ON [PRIMARY] command does? Regards.

    Read the article

  • Rails: Modeling an optional relation in ActiveRecord

    - by Hassinus
    I would like to map a relation between two Rails models, where one side can be optionnal. Let's me be more precise... I have two models: Profile that stores user profile information (name, age,...) and User model that stores user access to the application (email, password,...). To give you more information, User model is handled by Devise gem for signup/signin. Here is the scenario of my app: 1/ When a user register, a new row is created in User table and there is an equivalent in Profile table. This leads to the following script: class User < ActiveRecord::Base belongs_to :profile end 2/ A user can create it's profile without registering (kind of public profile with public information), so a row in Profile doesn't have necessarily a User row equivalent (here is the optional relation, the 0..1 relation in UML). Question: What is the corresponding script to put in class Profile < AR::Base to map optionally with User? Thanks in advance.

    Read the article

  • What is the proper design of storing temporary users? [closed]

    - by Mendy
    In SO site both real users and temporary users can add a new questions. I assume each user type has a different table. My question is how can I attach the question to the right user? I assuming the temp users have their own table from the following reasons: Temp users don't have all the data that real users have. like: email, password, and all users details. On the other hand, temp users are a lot more then real users. So it make more sense to have they in their own table.

    Read the article

  • which sql query is more efficient: select count(*) or select ... where key>value?

    - by davka
    I need to periodically update a local cache with new additions to some DB table. The table rows contain an auto-increment sequential number (SN) field. The cache keeps this number too, so basically I just need to fetch all rows with SN larger than the highest I already have. SELECT * FROM table where SN > <max_cached_SN> However, the majority of the attempts will bring no data (I just need to make sure that I have an absolutely up-to-date local copy). So I wander if this will be more efficient: count = SELECT count(*) from table; if (count > <cache_size>) // fetch new rows as above I suppose that selecting by an indexed numeric field is quite efficient, so I wander whether using count has benefit. On the other hand, this test/update will be done quite frequently and by many clients, so there is a motivation to optimize it.

    Read the article

  • replicating master tables mapping in transaction tables

    - by NoDisplay
    I have three master tables for location information Country {ID, Name} State {ID, Name, CountryID} City {ID, Name, StateID} Now I have one transcation table called Person which hold the person name and his location information. My Question is shall I have only CityID in the Person table like this: Person {ID, Name, CityID}' And have view of join query which give me detail like "Person{ID,Name,City,State,Country}" or Shall I replicate the mapping Person {ID, Name, CityID, StateID, CountryID} Please suggest which do you feel is to be selected and why? if there is any other option available, please suggest. Thanks in advance.

    Read the article

  • Getting deadlocks in MySQL

    - by at
    We're very frustratingly getting deadlocks in MySQL. It isn't because of exceeding a lock timeout as the deadlocks happen instantly when they do happen. Here's the SQL code that is executing on 2 separate threads (with 2 separate connections from the connection pool) that produces a deadlock: UPDATE Sequences SET Counter = LAST_INSERT_ID(Counter + 1) WHERE Sequence IS NULL Sequences table has 2 columns: Sequence and Counter The LAST_INSERT_ID allows us to retrieve this updated counter value as per MySQL's recommendation. That works perfect for us, but we get these deadlocks! Why are we getting them and how can we avoid them?? Thanks so much for any help with this.

    Read the article

  • Are there actually lag times to remove an email address from "the system"? [closed]

    - by Alex Gosselin
    For example, you send an unsubscribe message to a legitimate company or a spam, they reply that they will remove you and it may take up to 72 hours to take effect. I find it hard to believe anything that simple could take more than 3/4 of a second to take effect system wide. Another example would be when you call the visa activation line, there is a "delay" of several minutes while they try to sell you some kind of insurance. Usually just as you get the point across that you don't want it they will tell you your card has been activated and let you go. Are these delays real?

    Read the article

  • SQL SERVER – Shrinking NDF and MDF Files – Readers’ Opinion

    - by pinaldave
    Previously, I had written a blog post about SQL SERVER – Shrinking NDF and MDF Files – A Safe Operation. After that, I have written the following blog post that talks about the advantage and disadvantage of Shrinking and why one should not be Shrinking a file SQL SERVER – SHRINKFILE and TRUNCATE Log File in SQL Server 2008. On this subject, SQL Server Expert Imran Mohammed left an excellent comment. I just feel that his comment is worth a big article itself. For everybody to read his wonderful explanation, I am posting this blog post here. Thanks Imran! Shrinking Database always creates performance degradation and increases fragmentation in the database. I suggest that you keep that in mind before you start reading the following comment. If you are going to say Shrinking Database is bad and evil, here I am saying it first and loud. Now, the comment of Imran is written while keeping in mind only the process showing how the Shrinking Database Operation works. Imran has already explained his understanding and requests further explanation. I have removed the Best Practices section from Imran’s comments, as there are a few corrections. Comments from Imran - Before I explain to you the concept of Shrink Database, let us understand the concept of Database Files. When we create a new database inside the SQL Server, it is typical that SQl Server creates two physical files in the Operating System: one with .MDF Extension, and another with .LDF Extension. .MDF is called as Primary Data File. .LDF is called as Transactional Log file. If you add one or more data files to a database, the physical file that will be created in the Operating System will have an extension of .NDF, which is called as Secondary Data File; whereas, when you add one or more log files to a database, the physical file that will be created in the Operating System will have the same extension as .LDF. The questions now are, “Why does a new data file have a different extension (.NDF)?”, “Why is it called as a secondary data file?” and, “Why is .MDF file called as a primary data file?” Answers: Note: The following explanation is based on my limited knowledge of SQL Server, so experts please do comment. A data file with a .MDF extension is called a Primary Data File, and the reason behind it is that it contains Database Catalogs. Catalogs mean Meta Data. Meta Data is “Data about Data”. An example for Meta Data includes system objects that store information about other objects, except the data stored by the users. sysobjects stores information about all objects in that database. sysindexes stores information about all indexes and rows of every table in that database. syscolumns stores information about all columns that each table has in that database. sysusers stores how many users that database has. Although Meta Data stores information about other objects, it is not the transactional data that a user enters; rather, it’s a system data about the data. Because Primary Data File (.MDF) contains important information about the database, it is treated as a special file. It is given the name Primary Data file because it contains the Database Catalogs. This file is present in the Primary File Group. You can always create additional objects (Tables, indexes etc.) in the Primary data file (This file is present in the Primary File group), by mentioning that you want to create this object under the Primary File Group. Any additional data file that you add to the database will have only transactional data but no Meta Data, so that’s why it is called as the Secondary Data File. It is given the extension name .NDF so that the user can easily identify whether a specific data file is a Primary Data File or a Secondary Data File(s). There are many advantages of storing data in different files that are under different file groups. You can put your read only in the tables in one file (file group) and read-write tables in another file (file group) and take a backup of only the file group that has read the write data, so that you can avoid taking the backup of a read-only data that cannot be altered. Creating additional files in different physical hard disks also improves I/O performance. A real-time scenario where we use Files could be this one: Let’s say you have created a database called MYDB in the D-Drive which has a 50 GB space. You also have 1 Database File (.MDF) and 1 Log File on D-Drive and suppose that all of that 50 GB space has been used up and you do not have any free space left but you still want to add an additional space to the database. One easy option would be to add one more physical hard disk to the server, add new data file to MYDB database and create this new data file in a new hard disk then move some of the objects from one file to another, and put the file group under which you added new file as default File group, so that any new object that is created gets into the new files, unless specified. Now that we got a basic idea of what data files are, what type of data they store and why they are named the way they are, let’s move on to the next topic, Shrinking. First of all, I disagree with the Microsoft terminology for naming this feature as “Shrinking”. Shrinking, in regular terms, means to reduce the size of a file by means of compressing it. BUT in SQL Server, Shrinking DOES NOT mean compressing. Shrinking in SQL Server means to remove an empty space from database files and release the empty space either to the Operating System or to SQL Server. Let’s examine this through an example. Let’s say you have a database “MYDB” with a size of 50 GB that has a free space of about 20 GB, which means 30GB in the database is filled with data and the 20 GB of space is free in the database because it is not currently utilized by the SQL Server (Database); it is reserved and not yet in use. If you choose to shrink the database and to release an empty space to Operating System, and MIND YOU, you can only shrink the database size to 30 GB (in our example). You cannot shrink the database to a size less than what is filled with data. So, if you have a database that is full and has no empty space in the data file and log file (you don’t have an extra disk space to set Auto growth option ON), YOU CANNOT issue the SHRINK Database/File command, because of two reasons: There is no empty space to be released because the Shrink command does not compress the database; it only removes the empty space from the database files and there is no empty space. Remember, the Shrink command is a logged operation. When we perform the Shrink operation, this information is logged in the log file. If there is no empty space in the log file, SQL Server cannot write to the log file and you cannot shrink a database. Now answering your questions: (1) Q: What are the USEDPAGES & ESTIMATEDPAGES that appear on the Results Pane after using the DBCC SHRINKDATABASE (NorthWind, 10) ? A: According to Books Online (For SQL Server 2000): UsedPages: the number of 8-KB pages currently used by the file. EstimatedPages: the number of 8-KB pages that SQL Server estimates the file could be shrunk down to. Important Note: Before asking any question, make sure you go through Books Online or search on the Google once. The reasons for doing so have many advantages: 1. If someone else already has had this question before, chances that it is already answered are more than 50 %. 2. This reduces your waiting time for the answer. (2) Q: What is the difference between Shrinking the Database using DBCC command like the one above & shrinking it from the Enterprise Manager Console by Right-Clicking the database, going to TASKS & then selecting SHRINK Option, on a SQL Server 2000 environment? A: As far as my knowledge goes, there is no difference, both will work the same way, one advantage of using this command from query analyzer is, your console won’t be freezed. You can do perform your regular activities using Enterprise Manager. (3) Q: What is this .NDF file that is discussed above? I have never heard of it. What is it used for? Is it used by end-users, DBAs or the SERVER/SYSTEM itself? A: .NDF File is a secondary data file. You never heard of it because when database is created, SQL Server creates database by default with only 1 data file (.MDF) and 1 log file (.LDF) or however your model database has been setup, because a model database is a template used every time you create a new database using the CREATE DATABASE Command. Unless you have added an extra data file, you will not see it. This file is used by the SQL Server to store data which are saved by the users. Hope this information helps. I would like to as the experts to please comment if what I understand is not what the Microsoft guys meant. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Readers Contribution, Readers Question, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – NuoDB in Sixty Seconds – SQL in Sixty Seconds #053

    - by Pinal Dave
    Earlier this week, I have done five part blog series on NuoDB and it was very well received by audience. NuoDB is an elastically scalable SQL database that can run on local host, datacenter and cloud-based resources. t is an operational NewSQL database built on a patented emergent architecture with full support for SQL and ACID guarantees. In this blog post, I will explore how one can download and install NuoDB database. In this video I explain how one can install NuoDB in very few seconds and set up the entire environment in additional few seconds. One can get going with installation of NuoDB and sample database in total of less than 60 seconds. Let us see the same concept in following SQL in Sixty Seconds Video: You can Download NuoDB and reproduce the same Sixty Seconds experience. Related Tips in SQL in Sixty Seconds: Part 1 – Install NuoDB in 90 Seconds Part 2 – Manage NuoDB Installation Part 3 – Explore NuoDB Database Part 4 – Migrate from SQL Server to NuoDB Part 5 - NuoDB and Third Party Explorer What would you like to see in the next SQL in Sixty Seconds video? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Database, Pinal Dave, PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Interview Questions and Answers, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Identity

    Read the article

  • Software Developer Interview Question - Fair or Unfair

    - by user607018
    I just phone interviewed with a company for a graduate software developer position and was asked the following questions. I should add that the company concerned are not a database vendor. How does a query optimiser work? If a database was performing badly how would you use the performance logs to find out the problem. I have asked whether they ask such questions of all candidate software developers (graduate or experienced) in a first phone interview. They replied that they like to test their candidates knowledge of database development. I want to write to the company to say that these questions are unreasonable to ask at a software developer interview and to request that my interview be done over. I would like to check the reasonableness of the following assumptions a) Those questions cannot be fairly classified as database development questions. b) I think the questions are appropriate for a DBA interview but wholly unreasonable for a software developer interview (experienced or not). c) The first question is only relevant to a database vendor. d) The second question is not fair because software developers typically don't deal with database performance logs as that is the job of the DBA. Perhaps some of you will be kind enough to comment on my assumptions or may have any other suggestions, before I write to the company.

    Read the article

  • Generic Repository with SQLite and SQL Compact Databases

    - by Andrew Petersen
    I am creating a project that has a mobile app (Xamarin.Android) using a SQLite database and a WPF application (Code First Entity Framework 5) using a SQL Compact database. This project will even eventually have a SQL Server database as well. Because of this I am trying to create a generic repository, so that I can pass in the correct context depending on which application is making the request. The issue I ran into is my DataContext for the SQL Compact database inherits from DbContext and the SQLite database inherits from SQLiteConnection. What is the best way to make this generic, so that it doesn't matter what kind of database is on the back end? This is what I have tried so far on the SQL Compact side: public interface IRepository<TEntity> { TEntity Add(TEntity entity); } public class Repository<TEntity, TContext> : IRepository<TEntity>, IDisposable where TEntity : class where TContext : DbContext { private readonly TContext _context; public Repository(DbContext dbContext) { _context = dbContext as TContext; } public virtual TEntity Add(TEntity entity) { return _context.Set<TEntity>().Add(entity); } } And on the SQLite side: public class ElverDatabase : SQLiteConnection { static readonly object Locker = new object(); public ElverDatabase(string path) : base(path) { CreateTable<Ticket>(); } public int Add<T>(T item) where T : IBusinessEntity { lock (Locker) { return Insert(item); } } }

    Read the article

  • Databases and the CI server

    - by mlk
    I have a CI server (Hudson) which merrily builds, runs unit tests and deploys to the development environment but I'd now like to get it running the integration tests. The integration tests will hit a database and that database will be consistently being changed to contain the data relevant to the test in question. This however leads to a problem - how do I make sure the database is not being splatted with data for one test and then that data being override by a second project before the first set of tests complete? I am current using the "hope" method, which is not working out too badly at the moment, but mostly due to the fact that we only have a small number of integration tests set up on CI. As I see it I have the following options: Test-local (in memory) databases I'm not sure if any in-memory databases handle all the scaryness of Oracles triggers and packages etc, and anything less I don't feel would be a worth while test. CI Executor-local databasesA fair amount of work would be needed to set this up and keep 'em up to date, but defiantly an option (most of the work is already done to keep the current CI database up-to-date). Single "integration test" executorLikely the easiest to implement, but would mean the integration tests could fall quite far behind. Locking the database (or set of tables) I'm sure I've missed some ways (please add them). How do you run database-based integration tests on the CI server? What issues have you had and what method do you recommend? (Note: While I use Hudson, I'm happy to accept answers for any CI server, the ideas I'm sure will be portable, even if the details are not). Cheers,      Mlk

    Read the article

  • One codebase - lots of hosted services (similar to a basecamp style service) - planning structure

    - by RickM
    We have built a service (PHP Based) for a client, and are now looking to offer it to other clients as a hosted service. For this example, think of it like a hosted forum service, where a client signs up on our site, and is given a subdomain or can use their own domain, and the code picks up the domain, checks it against a 'master' users table, and then loads the content as needed. I'm trying to work out the best way of handling multiple clients. At the moment I can only think of two options that would work: Option 1 - Have 1 set of database tables, but on each table have a column called 'siteid' - this would mean every query has to check the siteid. This would effectively work with just 1 codebase, and 1 database. Option 2 - Have 1 'master' database with all the core stuff such as the client details and their domain. Then when the systen checks the domain, it pulls the clients database details (username/password/dbname) from a table, and loads a second database. The issue here is security of the mysql server details, however it does have the benefit that they are running their own database instead of sharing one. Which option would I be better taking here, and why? Ideally I want it to be fairly easy to convert the 'standalone' script to the 'multi-domain' script as we're on a tight deadline.

    Read the article

  • 11g R2 on Windows???????!

    - by [email protected]
    Oracle Database?????????11g Release2(R2)?????????????Windows???????! ????Windows 7? Windows Server 2008 R2????????11g R2 for Windows????????OTN???????11gR2??????????????????????????????? ??????????????! OTN??????????????????????????????????????????????????????????????(?)????????????????????????????????????? http://www.oracle.com/technology/global/jp/software/products/database/index.html ? ???????????????????????????????????????????????????????????????? ??????????????????????? ???????! ??????????????????·??·??????????·??????·RAC????????????? GUI(Oracle Enterprise Manager)????????????···?????????????2??xx????? ??? http://download.oracle.com/docs/cd/E16338_01/index.htm????????????????????? ???????! ???PC?????????Oracle Direct Seminar?????????????????????????????11gR2?????????????????????? ???????????3???????? 2010?4?28?(?) 11:00~12:00 Windows 7 / Windows Server 2008 R2 ?????? Oracle ! 2010?5?26?(?) 11:00~12:00 ? ????DB??????????11g R2??? 2010?5?19?(?) 18:00~20:00 [Oracle Evening Seminar]Oracle Database 11g Release2?Windows Server 2008 R2/Windows7?????! 2010?5?22?(?) 13:00~17:00 [Oracle Weekend Seminar] ????????! Oracle Database ??? ????????????????????????????????! ????????????? ????????????????????? http://www.oracle.com/technology/global/jp/documentation/database.html OTN???????11gR2 ??????????????????????????????????????????????????????? ??? ????????????????????????????????????????????????????????????? ???????? ?????? ?????? FAQ ??????????????FAQ?????????? Coming Soon! 11gR2 on Windows ???????????????5???????????????????? ???????!???????????????????????????????????????··! ???????Oracle Database??????????????????OTN???????????????! ???????????????http://www.oracle.com/technology/global/jp/membership/index.html

    Read the article

  • How do I prove I should put a table of values in source code instead of a database table?

    - by FastAl
    <tldr>looking for a reference to a book or other undeniably authoritative source that gives reasons when you should choose a database vs. when you should choose other storage methods. I have provided an un-authoritative list of reasons about 2/3 of the way down this post.</tldr> I have a situation at my company where a database is being used where it would be better to use another solution (in this case, an auto-generated piece of source code that contains a static lookup table, searched by binary sort). Normally, a database would be an OK solution even though the problem does not require a database, e.g, none of the elements of ACID are needed, as it is read-only data, updated about every 3-5 years (also requiring other sourcecode changes), and fits in memory, and can be keyed into via binary search (a tad faster than db, but speed is not an issue). The problem is that this code runs on our enterprise server, but is shared with several PC platforms (some disconnected, some use a central DB, etc.), and parts of it are managed by multiple programming units, parts by the DBAs, parts even by mathematicians in another department, etc. These hit their own platform’s version of their databases (containing their own copy of the static data). What happens is that every implementation, every little change, something different goes wrong. There are many other issues as well. I can’t even use a flatfile, because one mode of running on our enterprise server does not have permission to read files (only databases, and of course, its own literal storage, e.g., in-source table). Of course, other parts of the system use databases in proper, less obscure manners; there is no problem with those parts. So why don’t we just change it? I don’t have administrative ability to force a change. But I’m affected because sometimes I have to help fix the problems, but mostly because it causes outages and tons of extra IT time by other programmers and d*mmit that makes me mad! The reason neither management, nor the designers of the system, can see the problem is that they propose a solution that won’t work: increase communication; implement more safeguards and standards; etc. But every time, in a different part of the already-pared-down but still multi-step processes, a few different diligent, hard-working, top performing IT personnel make a unique subtle error that causes it to fail, sometimes after the last round of testing! And in general these are not single-person failures, but understandable miscommunications. And communication at our company is actually better than most. People just don't think that's the case because they haven't dug into the matter. However, I have it on very good word from somebody with extensive formal study of sociology and psychology that the relatively small amount of less-than-proper database usage in this gigantic cross-platform multi-source, multi-language project is bureaucratically un-maintainable. Impossible. No chance. At least with Human Beings in the loop, and it can’t be automated. In addition, the management and developers who could change this, though intelligent and capable, don’t understand the rigidity of this ‘how humans are’ issue, and are not convincible on the matter. The reason putting the static data in sourcecode will solve the problem is, although the solution is less sexy than a database, it would function with no technical drawbacks; and since the sharing of sourcecode already works very well, you basically erase any database-related effort from this section of the project, along with all the drawbacks of it that are causing problems. OK, that’s the background, for the curious. I won’t be able to convince management that this is an unfixable sociological problem, and that the real solution is coding around these limits of human nature, just as you would code around a bug in a 3rd party component that you can’t change. So what I have to do is exploit the unsuitableness of the database solution, and not do it using logic, but rather authority. I am aware of many reasons, and posts on this site giving reasons for one over the other; I’m not looking for lists of reasons like these (although you can add a comment if I've miss a doozy): WHY USE A DATABASE? instead of flatfile/other DB vs. file: if you need... Random Read / Transparent search optimization Advanced / varied / customizable Searching and sorting capabilities Transaction/rollback Locks, semaphores Concurrency control / Shared users Security 1-many/m-m is easier Easy modification Scalability Load Balancing Random updates / inserts / deletes Advanced query Administrative control of design, etc. SQL / learning curve Debugging / Logging Centralized / Live Backup capabilities Cached queries / dvlp & cache execution plans Interleaved update/read Referential integrity, avoid redundant/missing/corrupt/out-of-sync data Reporting (from on olap or oltp db) / turnkey generation tools [Disadvantages:] Important to get right the first time - professional design - but only b/c it's meant to last s/w & h/w cost Usu. over a network, speed issue (best vs. best design vs. local=even then a separate process req's marshalling/netwk layers/inter-p comm) indicies and query processing can stand in the way of simple processing (vs. flatfile) WHY USE FLATFILE: If you only need... Sequential Row processing only Limited usage append only (no reading, no master key/update) Only Update the record you're reading (fixed length recs only) Too big to fit into memory If Local disk / read-ahead network connection Portability / small system Email / cut & Paste / store as document by novice - simple format Low design learning curve but high cost later WHY USE IN-MEMORY/TABLE (tables, arrays, etc.): if you need... Processing a single db/ff record that was imported Known size of data Static data if hardcoding the table Narrow, unchanging use (e.g., one program or proc) -includes a class that will be shared, but encapsulates its data manipulation Extreme speed needed / high transaction frequency Random access - but search is dependent on implementation Following are some other posts about the topic: http://stackoverflow.com/questions/1499239/database-vs-flat-text-file-what-are-some-technical-reasons-for-choosing-one-over http://stackoverflow.com/questions/332825/are-flat-file-databases-any-good http://stackoverflow.com/questions/2356851/database-vs-flat-files http://stackoverflow.com/questions/514455/databases-vs-plain-text/514530 What I’d like to know is if anybody could recommend a hard, authoritative source containing these reasons. I’m looking for a paper book I can buy, or a reputable website with whitepapers about the issue (e.g., Microsoft, IBM), not counting the user-generated content on those sites. This will have a greater change to elicit a change that I’m looking for: less wasted programmer time, and more reliable programs. Thanks very much for your help. You win a prize for reading such a large post!

    Read the article

  • VS 2010 SP1 and SQL CE

    - by ScottGu
    Last month we released the Beta of VS 2010 Service Pack 1 (SP1).  You can learn more about the VS 2010 SP1 Beta from Jason Zander’s two blog posts about it, and from Scott Hanselman’s blog post that covers some of the new capabilities enabled with it.   You can download and install the VS 2010 SP1 Beta here. Last week I blogged about the new Visual Studio support for IIS Express that we are adding with VS 2010 SP1. In today’s post I’m going to talk about the new VS 2010 SP1 tooling support for SQL CE, and walkthrough some of the cool scenarios it enables.  SQL CE – What is it and why should you care? SQL CE is a free, embedded, database engine that enables easy database storage. No Database Installation Required SQL CE does not require you to run a setup or install a database server in order to use it.  You can simply copy the SQL CE binaries into the \bin directory of your ASP.NET application, and then your web application can use it as a database engine.  No setup or extra security permissions are required for it to run. You do not need to have an administrator account on the machine. Just copy your web application onto any server and it will work. This is true even of medium-trust applications running in a web hosting environment. SQL CE runs in-memory within your ASP.NET application and will start-up when you first access a SQL CE database, and will automatically shutdown when your application is unloaded.  SQL CE databases are stored as files that live within the \App_Data folder of your ASP.NET Applications. Works with Existing Data APIs SQL CE 4 works with existing .NET-based data APIs, and supports a SQL Server compatible query syntax.  This means you can use existing data APIs like ADO.NET, as well as use higher-level ORMs like Entity Framework and NHibernate with SQL CE.  This enables you to use the same data programming skills and data APIs you know today. Supports Development, Testing and Production Scenarios SQL CE can be used for development scenarios, testing scenarios, and light production usage scenarios.  With the SQL CE 4 release we’ve done the engineering work to ensure that SQL CE won’t crash or deadlock when used in a multi-threaded server scenario (like ASP.NET).  This is a big change from previous releases of SQL CE – which were designed for client-only scenarios and which explicitly blocked running in web-server environments.  Starting with SQL CE 4 you can use it in a web-server as well. There are no license restrictions with SQL CE.  It is also totally free. Easy Migration to SQL Server SQL CE is an embedded database – which makes it ideal for development, testing, and light-usage scenarios.  For high-volume sites and applications you’ll probably want to migrate your database to use SQL Server Express (which is free), SQL Server or SQL Azure.  These servers enable much better scalability, more development features (including features like Stored Procedures – which aren’t supported with SQL CE), as well as more advanced data management capabilities. We’ll ship migration tools that enable you to optionally take SQL CE databases and easily upgrade them to use SQL Server Express, SQL Server, or SQL Azure.  You will not need to change your code when upgrading a SQL CE database to SQL Server or SQL Azure.  Our goal is to enable you to be able to simply change the database connection string in your web.config file and have your application just work. New Tooling Support for SQL CE in VS 2010 SP1 VS 2010 SP1 includes much improved tooling support for SQL CE, and adds support for using SQL CE within ASP.NET projects for the first time.  With VS 2010 SP1 you can now: Create new SQL CE Databases Edit and Modify SQL CE Database Schema and Indexes Populate SQL CE Databases within Data Use the Entity Framework (EF) designer to create model layers against SQL CE databases Use EF Code First to define model layers in code, then create a SQL CE database from them, and optionally edit the DB with VS Deploy SQL CE databases to remote servers using Web Deploy and optionally convert them to full SQL Server databases You can take advantage of all of the above features from within both ASP.NET Web Forms and ASP.NET MVC based projects. Download You can enable SQL CE tooling support within VS 2010 by first installing VS 2010 SP1 (beta). Once SP1 is installed, you’ll also then need to install the SQL CE Tools for Visual Studio download.  This is a separate download that enables the SQL CE tooling support for VS 2010 SP1. Walkthrough of Two Scenarios In this blog post I’m going to walkthrough how you can take advantage of SQL CE and VS 2010 SP1 using both an ASP.NET Web Forms and an ASP.NET MVC based application. Specifically, we’ll walkthrough: How to create a SQL CE database using VS 2010 SP1, then use the EF4 visual designers in Visual Studio to construct a model layer from it, and then display and edit the data using an ASP.NET GridView control. How to use an EF Code First approach to define a model layer using POCO classes and then have EF Code-First “auto-create” a SQL CE database for us based on our model classes.  We’ll then look at how we can use the new VS 2010 SP1 support for SQL CE to inspect the database that was created, populate it with data, and later make schema changes to it.  We’ll do all this within the context of an ASP.NET MVC based application. You can follow the two walkthroughs below on your own machine by installing VS 2010 SP1 (beta) and then installing the SQL CE Tools for Visual Studio download (which is a separate download that enables SQL CE tooling support for VS 2010 SP1). Walkthrough 1: Create a SQL CE Database, Create EF Model Classes, Edit the Data with a GridView This first walkthrough will demonstrate how to create and define a SQL CE database within an ASP.NET Web Form application.  We’ll then build an EF model layer for it and use that model layer to enable data editing scenarios with an <asp:GridView> control. Step 1: Create a new ASP.NET Web Forms Project We’ll begin by using the File->New Project menu command within Visual Studio to create a new ASP.NET Web Forms project.  We’ll use the “ASP.NET Web Application” project template option so that it has a default UI skin implemented: Step 2: Create a SQL CE Database Right click on the “App_Data” folder within the created project and choose the “Add->New Item” menu command: This will bring up the “Add Item” dialog box.  Select the “SQL Server Compact 4.0 Local Database” item (new in VS 2010 SP1) and name the database file to create “Store.sdf”: Note that SQL CE database files have a .sdf filename extension. Place them within the /App_Data folder of your ASP.NET application to enable easy deployment. When we clicked the “Add” button above a Store.sdf file was added to our project: Step 3: Adding a “Products” Table Double-clicking the “Store.sdf” database file will open it up within the Server Explorer tab.  Since it is a new database there are no tables within it: Right click on the “Tables” icon and choose the “Create Table” menu command to create a new database table.  We’ll name the new table “Products” and add 4 columns to it.  We’ll mark the first column as a primary key (and make it an identify column so that its value will automatically increment with each new row): When we click “ok” our new Products table will be created in the SQL CE database. Step 4: Populate with Data Once our Products table is created it will show up within the Server Explorer.  We can right-click it and choose the “Show Table Data” menu command to edit its data: Let’s add a few sample rows of data to it: Step 5: Create an EF Model Layer We have a SQL CE database with some data in it – let’s now create an EF Model Layer that will provide a way for us to easily query and update data within it. Let’s right-click on our project and choose the “Add->New Item” menu command.  This will bring up the “Add New Item” dialog – select the “ADO.NET Entity Data Model” item within it and name it “Store.edmx” This will add a new Store.edmx item to our solution explorer and launch a wizard that allows us to quickly create an EF model: Select the “Generate From Database” option above and click next.  Choose to use the Store.sdf SQL CE database we just created and then click next again.  The wizard will then ask you what database objects you want to import into your model.  Let’s choose to import the “Products” table we created earlier: When we click the “Finish” button Visual Studio will open up the EF designer.  It will have a Product entity already on it that maps to the “Products” table within our SQL CE database: The VS 2010 SP1 EF designer works exactly the same with SQL CE as it does already with SQL Server and SQL Express.  The Product entity above will be persisted as a class (called “Product”) that we can programmatically work against within our ASP.NET application. Step 6: Compile the Project Before using your model layer you’ll need to build your project.  Do a Ctrl+Shift+B to compile the project, or use the Build->Build Solution menu command. Step 7: Create a Page that Uses our EF Model Layer Let’s now create a simple ASP.NET Web Form that contains a GridView control that we can use to display and edit the our Products data (via the EF Model Layer we just created). Right-click on the project and choose the Add->New Item command.  Select the “Web Form from Master Page” item template, and name the page you create “Products.aspx”.  Base the master page on the “Site.Master” template that is in the root of the project. Add an <h2>Products</h2> heading the new Page, and add an <asp:gridview> control within it: Then click the “Design” tab to switch into design-view. Select the GridView control, and then click the top-right corner to display the GridView’s “Smart Tasks” UI: Choose the “New data source…” drop down option above.  This will bring up the below dialog which allows you to pick your Data Source type: Select the “Entity” data source option – which will allow us to easily connect our GridView to the EF model layer we created earlier.  This will bring up another dialog that allows us to pick our model layer: Select the “StoreEntities” option in the dropdown – which is the EF model layer we created earlier.  Then click next – which will allow us to pick which entity within it we want to bind to: Select the “Products” entity in the above dialog – which indicates that we want to bind against the “Product” entity class we defined earlier.  Then click the “Enable automatic updates” checkbox to ensure that we can both query and update Products.  When you click “Finish” VS will wire-up an <asp:EntityDataSource> to your <asp:GridView> control: The last two steps we’ll do will be to click the “Enable Editing” checkbox on the Grid (which will cause the Grid to display an “Edit” link on each row) and (optionally) use the Auto Format dialog to pick a UI template for the Grid. Step 8: Run the Application Let’s now run our application and browse to the /Products.aspx page that contains our GridView.  When we do so we’ll see a Grid UI of the Products within our SQL CE database. Clicking the “Edit” link for any of the rows will allow us to edit their values: When we click “Update” the GridView will post back the values, persist them through our EF Model Layer, and ultimately save them within our SQL CE database. Learn More about using EF with ASP.NET Web Forms Read this tutorial series on the http://asp.net site to learn more about how to use EF with ASP.NET Web Forms.  The tutorial series uses SQL Express as the database – but the nice thing is that all of the same steps/concepts can also now also be done with SQL CE.   Walkthrough 2: Using EF Code-First with SQL CE and ASP.NET MVC 3 We used a database-first approach with the sample above – where we first created the database, and then used the EF designer to create model classes from the database.  In addition to supporting a designer-based development workflow, EF also enables a more code-centric option which we call “code first development”.  Code-First Development enables a pretty sweet development workflow.  It enables you to: Define your model objects by simply writing “plain old classes” with no base classes or visual designer required Use a “convention over configuration” approach that enables database persistence without explicitly configuring anything Optionally override the convention-based persistence and use a fluent code API to fully customize the persistence mapping Optionally auto-create a database based on the model classes you define – allowing you to start from code first I’ve done several blog posts about EF Code First in the past – I really think it is great.  The good news is that it also works very well with SQL CE. The combination of SQL CE, EF Code First, and the new VS tooling support for SQL CE, enables a pretty nice workflow.  Below is a simple example of how you can use them to build a simple ASP.NET MVC 3 application. Step 1: Create a new ASP.NET MVC 3 Project We’ll begin by using the File->New Project menu command within Visual Studio to create a new ASP.NET MVC 3 project.  We’ll use the “Internet Project” template so that it has a default UI skin implemented: Step 2: Use NuGet to Install EFCodeFirst Next we’ll use the NuGet package manager (automatically installed by ASP.NET MVC 3) to add the EFCodeFirst library to our project.  We’ll use the Package Manager command shell to do this.  Bring up the package manager console within Visual Studio by selecting the View->Other Windows->Package Manager Console menu command.  Then type: install-package EFCodeFirst within the package manager console to download the EFCodeFirst library and have it be added to our project: When we enter the above command, the EFCodeFirst library will be downloaded and added to our application: Step 3: Build Some Model Classes Using a “code first” based development workflow, we will create our model classes first (even before we have a database).  We create these model classes by writing code. For this sample, we will right click on the “Models” folder of our project and add the below three classes to our project: The “Dinner” and “RSVP” model classes above are “plain old CLR objects” (aka POCO).  They do not need to derive from any base classes or implement any interfaces, and the properties they expose are standard .NET data-types.  No data persistence attributes or data code has been added to them.   The “NerdDinners” class derives from the DbContext class (which is supplied by EFCodeFirst) and handles the retrieval/persistence of our Dinner and RSVP instances from a database. Step 4: Listing Dinners We’ve written all of the code necessary to implement our model layer for this simple project.  Let’s now expose and implement the URL: /Dinners/Upcoming within our project.  We’ll use it to list upcoming dinners that happen in the future. We’ll do this by right-clicking on our “Controllers” folder and select the “Add->Controller” menu command.  We’ll name the Controller we want to create “DinnersController”.  We’ll then implement an “Upcoming” action method within it that lists upcoming dinners using our model layer above.  We will use a LINQ query to retrieve the data and pass it to a View to render with the code below: We’ll then right-click within our Upcoming method and choose the “Add-View” menu command to create an “Upcoming” view template that displays our dinners.  We’ll use the “empty” template option within the “Add View” dialog and write the below view template using Razor: Step 4: Configure our Project to use a SQL CE Database We have finished writing all of our code – our last step will be to configure a database connection-string to use. We will point our NerdDinners model class to a SQL CE database by adding the below <connectionString> to the web.config file at the top of our project: EF Code First uses a default convention where context classes will look for a connection-string that matches the DbContext class name.  Because we created a “NerdDinners” class earlier, we’ve also named our connectionstring “NerdDinners”.  Above we are configuring our connection-string to use SQL CE as the database, and telling it that our SQL CE database file will live within the \App_Data directory of our ASP.NET project. Step 5: Running our Application Now that we’ve built our application, let’s run it! We’ll browse to the /Dinners/Upcoming URL – doing so will display an empty list of upcoming dinners: You might ask – but where did it query to get the dinners from? We didn’t explicitly create a database?!? One of the cool features that EF Code-First supports is the ability to automatically create a database (based on the schema of our model classes) when the database we point it at doesn’t exist.  Above we configured  EF Code-First to point at a SQL CE database in the \App_Data\ directory of our project.  When we ran our application, EF Code-First saw that the SQL CE database didn’t exist and automatically created it for us. Step 6: Using VS 2010 SP1 to Explore our newly created SQL CE Database Click the “Show all Files” icon within the Solution Explorer and you’ll see the “NerdDinners.sdf” SQL CE database file that was automatically created for us by EF code-first within the \App_Data\ folder: We can optionally right-click on the file and “Include in Project" to add it to our solution: We can also double-click the file (regardless of whether it is added to the project) and VS 2010 SP1 will open it as a database we can edit within the “Server Explorer” tab of the IDE. Below is the view we get when we double-click our NerdDinners.sdf SQL CE file.  We can drill in to see the schema of the Dinners and RSVPs tables in the tree explorer.  Notice how two tables - Dinners and RSVPs – were automatically created for us within our SQL CE database.  This was done by EF Code First when we accessed the NerdDinners class by running our application above: We can right-click on a Table and use the “Show Table Data” command to enter some upcoming dinners in our database: We’ll use the built-in editor that VS 2010 SP1 supports to populate our table data below: And now when we hit “refresh” on the /Dinners/Upcoming URL within our browser we’ll see some upcoming dinners show up: Step 7: Changing our Model and Database Schema Let’s now modify the schema of our model layer and database, and walkthrough one way that the new VS 2010 SP1 Tooling support for SQL CE can make this easier.  With EF Code-First you typically start making database changes by modifying the model classes.  For example, let’s add an additional string property called “UrlLink” to our “Dinner” class.  We’ll use this to point to a link for more information about the event: Now when we re-run our project, and visit the /Dinners/Upcoming URL we’ll see an error thrown: We are seeing this error because EF Code-First automatically created our database, and by default when it does this it adds a table that helps tracks whether the schema of our database is in sync with our model classes.  EF Code-First helpfully throws an error when they become out of sync – making it easier to track down issues at development time that you might otherwise only find (via obscure errors) at runtime.  Note that if you do not want this feature you can turn it off by changing the default conventions of your DbContext class (in this case our NerdDinners class) to not track the schema version. Our model classes and database schema are out of sync in the above example – so how do we fix this?  There are two approaches you can use today: Delete the database and have EF Code First automatically re-create the database based on the new model class schema (losing the data within the existing DB) Modify the schema of the existing database to make it in sync with the model classes (keeping/migrating the data within the existing DB) There are a couple of ways you can do the second approach above.  Below I’m going to show how you can take advantage of the new VS 2010 SP1 Tooling support for SQL CE to use a database schema tool to modify our database structure.  We are also going to be supporting a “migrations” feature with EF in the future that will allow you to automate/script database schema migrations programmatically. Step 8: Modify our SQL CE Database Schema using VS 2010 SP1 The new SQL CE Tooling support within VS 2010 SP1 makes it easy to modify the schema of our existing SQL CE database.  To do this we’ll right-click on our “Dinners” table and choose the “Edit Table Schema” command: This will bring up the below “Edit Table” dialog.  We can rename, change or delete any of the existing columns in our table, or click at the bottom of the column listing and type to add a new column.  Below I’ve added a new “UrlLink” column of type “nvarchar” (since our property is a string): When we click ok our database will be updated to have the new column and our schema will now match our model classes. Because we are manually modifying our database schema, there is one additional step we need to take to let EF Code-First know that the database schema is in sync with our model classes.  As i mentioned earlier, when a database is automatically created by EF Code-First it adds a “EdmMetadata” table to the database to track schema versions (and hash our model classes against them to detect mismatches between our model classes and the database schema): Since we are manually updating and maintaining our database schema, we don’t need this table – and can just delete it: This will leave us with just the two tables that correspond to our model classes: And now when we re-run our /Dinners/Upcoming URL it will display the dinners correctly: One last touch we could do would be to update our view to check for the new UrlLink property and render a <a> link to it if an event has one: And now when we refresh our /Dinners/Upcoming we will see hyperlinks for the events that have a UrlLink stored in the database: Summary SQL CE provides a free, embedded, database engine that you can use to easily enable database storage.  With SQL CE 4 you can now take advantage of it within ASP.NET projects and applications (both Web Forms and MVC). VS 2010 SP1 provides tooling support that enables you to easily create, edit and modify SQL CE databases – as well as use the standard EF designer against them.  This allows you to re-use your existing skills and data knowledge while taking advantage of an embedded database option.  This is useful both for small applications (where you don’t need the scalability of a full SQL Server), as well as for development and testing scenarios – where you want to be able to rapidly develop/test your application without having a full database instance.  SQL CE makes it easy to later migrate your data to a full SQL Server or SQL Azure instance if you want to – without having to change any code in your application.  All we would need to change in the above two scenarios is the <connectionString> value within the web.config file in order to have our code run against a full SQL Server.  This provides the flexibility to scale up your application starting from a small embedded database solution as needed. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • SQL SERVER – Quick Note of Database Mirroring

    - by pinaldave
    Just a day ago, I was invited at Round Table meeting at prestigious organization. They were planning to implement High Availability solution using Database Mirroring. During the meeting, I have made few notes of what was being discussed there. I just thought it would be interested for all of you know about it. Database Mirroring works [...]

    Read the article

  • Ghost Records, Backups, and Database Compression…With a Pinch of Security Considerations

    - by Argenis
      Today Jeffrey Langdon (@jlangdon) posed on #SQLHelp the following questions: So I set to answer his question, and I said to myself: “Hey, I haven’t blogged in a while, how about I blog about this particular topic?”. Thus, this post was born. (If you have never heard of Ghost Records and/or the Ghost Cleanup Task, go see this blog post by Paul Randal) 1) Do ghost records get copied over in a backup? If you guessed yes, you guessed right. The backup process in SQL Server takes all data as it is on disk – it doesn’t crack the pages open to selectively pick which slots have actual data and which ones do not. The whole page is backed up, regardless of its contents. Even if ghost cleanup has run and processed the ghost records, the slots are not overwritten immediately, but rather until another DML operation comes along and uses them. As a matter of fact, all of the allocated space for a database will be included in a full backup. So, this poses a bit of a security/compliance problem for some of you DBA folk: if you want to take a full backup of a database after you’ve purged sensitive data, you should rebuild all of your indexes (with FILLFACTOR set to 100%). But the empty space on your data file(s) might still contain sensitive data! A SHRINKFILE might help get rid of that (not so) empty space, but that might not be the end of your troubles. You might _STILL_ have (not so) empty space on your files! One approach that you can follow is to export all of the data on your database to another SQL Server instance that does NOT have Instant File Initialization enabled. This can be a tedious and time-consuming process, though. So you have to weigh in your options and see what makes sense for you. Snapshot Replication is another idea that comes to mind. 2) Does Compression get rid of ghost records (2008)? The answer to this is no. The Ghost Records/Ghost Cleanup Task mechanism is alive and well on compressed tables and indexes. You can prove this running a simple script: CREATE DATABASE GhostRecordsTest GO USE GhostRecordsTest GO CREATE TABLE myTable (myPrimaryKey int IDENTITY(1,1) PRIMARY KEY CLUSTERED,                       myWideColumn varchar(1000) NOT NULL DEFAULT 'Default string value')                         ALTER TABLE myTable REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) GO INSERT INTO myTable DEFAULT VALUES GO 10 DELETE myTable WHERE myPrimaryKey % 2 = 0 DBCC TRACEON(2514) DBCC CHECKTABLE(myTable) TraceFlag 2514 will make DBCC CHECKTABLE give you an extra tidbit of information on its output. For the above script: “Ghost Record count = 5” Until next time,   -Argenis

    Read the article

  • Ghost Records, Backups, and Database Compression…With a Pinch of Security Considerations

    - by Argenis
      Today Jeffrey Langdon (@jlangdon) posed on #SQLHelp the following questions: So I set to answer his question, and I said to myself: “Hey, I haven’t blogged in a while, how about I blog about this particular topic?”. Thus, this post was born. (If you have never heard of Ghost Records and/or the Ghost Cleanup Task, go see this blog post by Paul Randal) 1) Do ghost records get copied over in a backup? If you guessed yes, you guessed right. The backup process in SQL Server takes all data as it is on disk – it doesn’t crack the pages open to selectively pick which slots have actual data and which ones do not. The whole page is backed up, regardless of its contents. Even if ghost cleanup has run and processed the ghost records, the slots are not overwritten immediately, but rather until another DML operation comes along and uses them. As a matter of fact, all of the allocated space for a database will be included in a full backup. So, this poses a bit of a security/compliance problem for some of you DBA folk: if you want to take a full backup of a database after you’ve purged sensitive data, you should rebuild all of your indexes (with FILLFACTOR set to 100%). But the empty space on your data file(s) might still contain sensitive data! A SHRINKFILE might help get rid of that (not so) empty space, but that might not be the end of your troubles. You might _STILL_ have (not so) empty space on your files! One approach that you can follow is to export all of the data on your database to another SQL Server instance that does NOT have Instant File Initialization enabled. This can be a tedious and time-consuming process, though. So you have to weigh in your options and see what makes sense for you. Snapshot Replication is another idea that comes to mind. 2) Does Compression get rid of ghost records (2008)? The answer to this is no. The Ghost Records/Ghost Cleanup Task mechanism is alive and well on compressed tables and indexes. You can prove this running a simple script: CREATE DATABASE GhostRecordsTest GO USE GhostRecordsTest GO CREATE TABLE myTable (myPrimaryKey int IDENTITY(1,1) PRIMARY KEY CLUSTERED,                       myWideColumn varchar(1000) NOT NULL DEFAULT 'Default string value')                         ALTER TABLE myTable REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) GO INSERT INTO myTable DEFAULT VALUES GO 10 DELETE myTable WHERE myPrimaryKey % 2 = 0 DBCC TRACEON(2514) DBCC CHECKTABLE(myTable) TraceFlag 2514 will make DBCC CHECKTABLE give you an extra tidbit of information on its output. For the above script: “Ghost Record count = 5” Until next time,   -Argenis

    Read the article

< Previous Page | 93 94 95 96 97 98 99 100 101 102 103 104  | Next Page >