Search Results

Search found 886 results on 36 pages for 'duplicates'.

Page 21/36 | < Previous Page | 17 18 19 20 21 22 23 24 25 26 27 28  | Next Page >

  • VMware ESXi 4 On-Disk Data Deduplication - possible and supported?

    - by hurikhan77
    Environment: We are running multiple web, database, and application servers that usually share a pretty common installation (Gentoo Linux) and similar configuration in VMware ESXi 4. The differences are usually only a few installed features or differing component versions. To create a new server, I usually choose the most similar (by features) running server, rsync a copy of it into freshly mounted filesystems, run grub, reconfigure, and reboot.

    Problem: Over time this duplicates many on-disk data blocks, probably summing up to several tens of gigabytes. I suppose if I could use a base system as a template, with the actual machines layered on top of it and only changed blocks written to some sort of "diff image", performance should improve (increased cache hit rate) and storage efficiency should increase (deduplicated storage space). This would be similar to what ESXi already supports for RAM deduplication (page sharing).

    Question: Is there any way to easily do this on ESXi 4? I already share the portage tree via NFS, but that would not work for the rootfs.

    Read the article

  • What's cool about Lisp nowadays? [closed]

    - by Kos
    Possible Duplicates: Why is Lisp useful? Is LISP still useful in today's world? Which version is most used?

    First of all, let me clarify: I'm aware of Lisp's place in history, as well as in education. I'm asking about its place in practical application, as of 2011. The question is: what features of Lisp make it the preferred choice for projects today? It's widely used in various AI areas as far as I know, and probably also elsewhere. I can imagine projects choosing, for instance, Python for its concise, readable syntax and for being dynamic; Haskell for being purely functional with a powerful type system; Matlab/Octave for the focus on numerics and big standard libraries; and so on. When should I consider Lisp the proper language for a given problem? What language features make it the preferred choice then? Is its "purity and generality" an advantage that makes it a better choice than the modern languages for some subset of projects?

    Edit - on request, a short rephrase (or simply a tl;dr) to make this more specific: a) What problems are solvable with Lisp much more easily than with more common, modern languages like Python or C# (or even F# or Scala)? b) What language features specific to Lisp make it the best choice for those problems?

    Read the article

  • Better Performance with Laptop i5+SSD or i7+HDD [closed]

    - by Cas Sakal
    Possible Duplicates: i5 vs. i7 processor dev laptop; Developer Notebook i5 or i7

    I could not decide which of the configurations below gives better performance for developers on a notebook (running VS.NET and SQL Server, no gaming): (A) i5 540M + SSD (Intel), or (B) i7 720M + 7200 RPM HDD (Western Digital). Since these two configurations (A, B) cost nearly the same for me, I would like to buy the faster one for my work environment. Please don't just comment "buy this" or "buy that"; if you can explain the reasoning behind your choice, I would appreciate it. Thank you, cas

    Read the article

  • Comparing, merging, calculating columns of data in Excel

    - by hickster
    I would like to create a formula that a) compares four columns of data (see below):

        Sep              Oct
        name    units    name    units
        apple   2        apple   3
        pear    3        pear    7
        orange  4        banana  6
        banana  3        toffee  5

    then b) merges the two "name" columns into one column, dropping any duplicates but still retaining the two unit columns (for months Sep and Oct):

        name     Sep units   Oct units
        apple    2           3
        pear     3           7
        orange   4           0
        banana   3           6
        toffee   0           6

    then c) creates a third column that compares "Sep units" against "Oct units" and puts the total in a "difference" column:

        name     Sep units   Oct units   difference
        apple    2           3           1
        pear     3           7           4
        orange   4           0           -4
        banana   3           6           3
        toffee   0           6           6
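
    In Excel itself this merge is usually done with VLOOKUP wrapped in IFERROR (defaulting missing months to 0). As a language-agnostic sketch of the same logic, here is a minimal Python version using the data from step (a); the column values are hard-coded purely for illustration:

        sep = {"apple": 2, "pear": 3, "orange": 4, "banana": 3}
        oct_units = {"apple": 3, "pear": 7, "banana": 6, "toffee": 5}

        # Merge the two name columns, preserving order and dropping duplicates.
        names = list(dict.fromkeys(list(sep) + list(oct_units)))

        # Months with no entry count as 0; difference = Oct - Sep.
        for name in names:
            s, o = sep.get(name, 0), oct_units.get(name, 0)
            print(name, s, o, o - s)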

    Read the article

  • Software licensing template that gives room for restricting usage to certain industries/uses of software/source

    - by BSara
    *Why this question is not a duplicate of the questions specified as such: I did not ask if there was a license that restricted specific uses, and I did not ask if I could rewrite every line of any open source project. I asked very specifically: "Does there exist X? If not, can I Y with Z?". As far as I can tell, the two questions that were specified as duplicates do not answer my specific question. Please remove the duplicate status placed on the question.

    I'm developing some software that I would like to be "semi" open source. I would like to allow anyone to use my software/source unless they are using it for certain purposes. For example, I don't want to allow usage of the software/source if it is being used to create, distribute, view, or otherwise support pornography, illegal purposes, etc. I'm no lawyer and couldn't ever hope to write a license myself, nor do I have the time to figure out how best to do this. My question is this: does there exist a freely available license, or a template for a license, that I can use to license my software under the conditions explained above, just like one can use the Creative Commons licenses? If not, am I allowed to just alter one of the Creative Commons licenses to meet my needs?

    Read the article

  • How to de-dupe identical photos that have a slightly different file size?

    - by GJ.
    I imported many photos using the new "camera import" feature of Dropbox. Many of those were duplicates of photos previously imported by copying directly from the camera. Strangely, the Dropbox import appears to slightly reduce the file size (a screenshot comparing the two files' properties appeared here). Comparing the two files with pdiff returns "Images are binary identical", but tools such as fdupes, or even Picasa's "show duplicate files" feature, consider them unique. What can cause this change in file size? Is there any way to undo it? Most importantly: how can I de-dupe efficiently without relying on file-size comparison? (Running a pdiff comparison over all photo pairs in my library is obviously impractical...) A solution for either OS X or Windows would do.
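
    Since pdiff already reports the images as pixel-identical, one practical approach is to hash the decoded pixel data rather than the file bytes, so metadata or container differences don't matter. A minimal sketch in Python, assuming Pillow is installed; the folder path is a placeholder:

        import hashlib
        from pathlib import Path
        from PIL import Image

        def pixel_hash(path):
            # Hash the decoded pixels, not the file, so differences in
            # metadata or re-compression of the container are ignored.
            with Image.open(path) as img:
                return hashlib.sha256(img.convert("RGB").tobytes()).hexdigest()

        seen = {}
        for photo in Path("~/Pictures").expanduser().rglob("*.jpg"):
            h = pixel_hash(photo)
            if h in seen:
                print(f"duplicate: {photo} == {seen[h]}")
            else:
                seen[h] = photo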

    Read the article

  • What's the best way to manage error logging for exceptions?

    - by Peter Boughton
    Introduction: If an error occurs on a website or system, it is of course useful to log it and show the user a polite message with a reference code for the error. And if you have lots of systems, you don't want this information dotted around - it is good to have a single centralised place for it. At the simplest level, all that's needed is an incrementing id and a serialized dump of the error details (and possibly the "centralised place" being an email inbox). At the other end of the spectrum is perhaps a fully normalised database that also allows you to press a button and see a graph of errors per day, or identify what the most common type of error on system X is, whether server A has more database connection errors than server B, and so on. What I'm referring to here is logging code-level errors/exceptions by a remote system - not "human-based" issue tracking, such as done with Jira, Trac, etc.

    Questions: I'm looking for thoughts from developers who have used this type of system, specifically with regards to:

    - What are essential features you couldn't do without?
    - What are good-to-have features that really save you time?
    - What features might seem a good idea, but aren't actually that useful?

    For example, I'd say a "show duplicates" function that identifies multiple occurrences of an error (without worrying about 'unimportant' details that might differ) is pretty essential. A button to "create an issue in [Jira/etc] for this error" sounds like a good time-saver. Just to re-iterate, what I'm after is practical experiences from people who have used such systems, preferably backed up with why a feature is awesome/terrible. (If you're going to theorise anyway, at the very least mark your answer as such.)
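
    For the "show duplicates" feature, a common technique is to fingerprint each exception from its type plus the top of its stack trace, ignoring volatile details. A minimal sketch in Python (the five-frame depth is an arbitrary assumption); two occurrences with the same fingerprint can then be grouped under one reference code:

        import hashlib
        import traceback

        def error_fingerprint(exc: BaseException, depth: int = 5) -> str:
            # Identify an error by its type and top stack frames, ignoring
            # volatile details such as variable values or timestamps.
            frames = traceback.extract_tb(exc.__traceback__)[:depth]
            parts = [type(exc).__name__]
            parts += [f"{f.filename}:{f.name}:{f.lineno}" for f in frames]
            return hashlib.sha256("|".join(parts).encode()).hexdigest()[:12]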

    Read the article

  • Can canonical links be used to make 'duplicate' pages unique?

    - by merk
    We have a website that allows users to list items for sale. Think eBay - except we don't actually handle the sale; we just list the item and provide a way to contact the seller. Anyhow, in several cases sellers may have multiple units of an item for sale. We don't have a quantity field, so they upload each item as a separate listing (and using a quantity field is not an option). So we have a lot of pages which basically have the exact same info, and only the item # might be different. The SEO guy we've started using has said we should put a canonical link on each page, and have the canonical link point to itself. So for example, www.mysite.com/something/ would have a canonical link of href="www.mysite.com/something/". This doesn't really seem kosher to me; I thought canonical links were supposed to point to other pages. The SEO guy claims doing it this way will tell Google all these pages are indeed unique, even if they do basically have the same content. This seems a little off to me, since what's to stop a spammer from putting up a million pages and doing this as well? Can anyone tell me if the SEO guy's suggestion is valid or not? If it's not valid, do I need to figure out some way to check for duplicated items, automatically pick one of the duplicates to serve as an original, and generate canonical links based off that? Thanks in advance for any help.
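
    If the self-referencing advice turns out to be wrong, the grouping step described at the end can be sketched mechanically: hash each listing's content and elect one item per group as the canonical target. A minimal sketch in Python; the listing data is hypothetical:

        import hashlib

        # Hypothetical listings: item number -> normalised description text.
        listings = {
            101: "Red widget, brand X, new in box",
            102: "Red widget, brand X, new in box",
            103: "Blue widget, brand Y, used",
        }

        def canonical_map(listings):
            # Group listings whose content hashes match, then elect the
            # lowest item number in each group as the canonical URL target.
            groups = {}
            for item, text in listings.items():
                key = hashlib.sha256(text.encode()).hexdigest()
                groups.setdefault(key, []).append(item)
            return {item: min(grp) for grp in groups.values() for item in grp}

        print(canonical_map(listings))  # {101: 101, 102: 101, 103: 103}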

    Read the article

  • Does a bad Internet connection increase bandwidth usage?

    - by Synetech
    My (Rogers) cable connection has been pretty bad recently (channels 3 and 10 are particularly fuzzy - it's analog, not digital cable). Not surprisingly, this has caused my cable modem to drop out and have to reestablish a connection a couple of times since it started. The poor connection of course means higher corruption (not necessarily dropped packets per se), which causes the TCP/IP stack to retransmit packets more often. Reduction of bandwidth throughput aside, I got to wondering if it increases the actual bandwidth usage. That is, if a high error rate on the line causes packets to be retransmitted:

    1. Does this increase a bandwidth monitoring program's numbers?
    2. Does the ISP count the retransmitted packets toward the monthly cap?

    Based on what I remember from my university networking courses and common sense, I have a feeling that the answer to both questions is yes, but I cannot reliably measure the first, and have no authoritative answer for the second. I'm wondering if maybe the retransmitted packets are recognized as duplicates and thus not counted somewhere along the line.

    Read the article

  • Restarting rsyslog re-sends logs again

    - by Jay Taylor
    I am running Ubuntu 12.04.1 LTS on EC2. I have a bunch of application servers which are configured to forward their logs to a central server via rsyslog. Since putting in Nagios monitoring on the log files on the central server, I've been getting alerts indicating that particular application servers are failing to forward their logs to the centralized server. Logging into the machines and restarting the rsyslog service fixes the problem. However, rsyslog then re-transmits the logs again, resulting in duplicates on the collector. Why is it doing this?

    Read the article

  • Android From Local DB (DAO) to Server sync (JSON) - Design issue

    - by Taiko
    I sync data between my local DB and a server, and I'm looking for the cleanest way to model all of this. I have a com.something.db package that contains a DataHelper and a couple of DAO classes representing objects stored in the DB (I didn't write that part):

        com.something.db
        -- public DataHelper
        -- public Employee
             @DatabaseField   // e.g. "name" is an actual column name in the DB
             - name
             @DatabaseField
             - salary
             ... (all in all 50 fields)

    I have a com.something.sync package that contains all the implementation detail on how to send data to the server. It boils down to a ConnectionManager that is fed by different classes implementing a 'Request' interface:

        com.something.sync
        -- public interface ConnectionManager
        -- package ConnectionManagerImpl
        -- public interface Request
        -- package LoginRequest
        -- package GetEmployeesRequest

    My issue is, at some point in the sync process, I have to JSONise and de-JSONise my data (e.g. the Employee class). But I really don't feel like having the same Employee class be responsible for both its JSONisation and its actual representation inside the local database. It really doesn't feel right, because I carefully decoupled the rest; I am only stuck on this JSON thing. What should I do? Should I write three Employee classes?

        EmployeeDB
             @DatabaseField   // e.g. "name" is an actual column name in the DB
             - name
             @DatabaseField
             - salary
             ... (50 fields)

        EmployeeInterface
             - getName
             - getSalary
             ... (50 fields)

        EmployeeJSON
             - JSON_KEY_NAME = "name"   // happens to match the column name, but that isn't a requirement
             - name
             - JSON_KEY_SALARY = "salary"
             - salary
             ... (50 fields)

    It feels like a lot of duplicates. Is there a common pattern I can use there?
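
    One common answer is to keep a single plain Employee entity and move all JSON knowledge into a dedicated mapper class (a DTO/mapper split), so neither concern leaks into the other. The idea is language-agnostic; a minimal sketch in Python, with the fields trimmed to two for brevity:

        import json
        from dataclasses import dataclass

        @dataclass
        class Employee:
            # Plain entity: knows nothing about JSON or the database.
            name: str
            salary: int

        class EmployeeJsonMapper:
            # All JSON key names and conversion logic live here.
            @staticmethod
            def to_json(e: Employee) -> str:
                return json.dumps({"name": e.name, "salary": e.salary})

            @staticmethod
            def from_json(raw: str) -> Employee:
                d = json.loads(raw)
                return Employee(name=d["name"], salary=d["salary"])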

    Read the article

  • How do you manage updates without a staging environment: CentOS 6.3

    - by Gregg Leventhal
    I am managing about 20 servers, many of them virtual. They almost all serve different purposes, and none are clustered. I have a distributed LAMP stack, a few application servers, some build servers, and a few KVM hosts. They are mostly CentOS 6.3, with a few Ubuntu (unfortunately). I don't have the resources to set up a staging environment where I can have duplicates of my machines and test updates before rolling them out. I am taking file backups. What I want to know is how you approach updating your Linux systems. I assume you don't just run yum update, but then how are you choosing the packages worth updating? When (if ever) are you updating the kernel, etc.? How do you test updates without a staging environment? Snapshot and hope for the best?

    Read the article

  • How to refactor a myriad of similar classes

    - by TobiMcNamobi
    I'm faced with similar classes A1, A2, ..., A100. Believe it or not, there are roughly a hundred classes that look almost the same. None of these classes are unit tested (of course ;-) ). Each of these classes is about 50 lines of code, which is not too much by itself, but it is still way too much duplicated code. I consider the following options (a test sketch for option 2 follows below):

    1. Write tests for A1, ..., A100, then refactor by creating an abstract base class AA. Pro: the tests make me (near to totally) safe that nothing goes wrong. Con: much effort; duplication of test code.

    2. Write tests for A1 and A2, abstract the duplicated test code, and use the abstraction to create the rest of the tests. Then create AA as in 1. Pro: less effort than in 1 while maintaining a similar degree of safety. Con: I find generalized test code weird; it often seems ... incoherent (is this the right word?). Normally I prefer specialized test code for specialized classes, but that requires a good design, which is my goal of this whole refactoring.

    3. Write AA first, testing it with mock classes, then successively make A1, ..., A100 inherit from it. Pro: fastest way to eliminate duplicates. Con: most Ax classes look very much the same, but if one doesn't, there is the danger of changing its behaviour by inheriting from AA.

    4. Other options ...

    At first I went for 3 because the Ax classes are really very similar to each other. But now I'm a bit unsure whether this is the right way (from a unit testing enthusiast's perspective).
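
    Option 2's "generalized test code" usually ends up as a parametrized test: one shared specification run against every class. A minimal sketch with pytest; the module, class, and method names are hypothetical:

        import pytest

        # Hypothetical imports; the real project would list A1 ... A100 here.
        from myproject import A1, A2, A3

        @pytest.mark.parametrize("cls", [A1, A2, A3])
        def test_shared_behaviour(cls):
            # Every Ax class must satisfy the same contract,
            # so one test body covers all of them.
            instance = cls()
            assert instance.process() is not None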

    Read the article

  • Asus laptop not retaining Aero theme when unplugged

    - by expiredninja
    The problem I'm having occurs when I unplug my laptop: it disregards several elements of my theme. The taskbar becomes unlocked, my desktop background turns white, and the transparency of my windows goes away. I can fix these things by running the Aero troubleshooting utility, which mentions something about the color bit depth. These are the specs for the computer: Asus laptop / Intel® Pentium® processor / 15.6" display / 4GB memory; Model: X54H-BD3MA; SKU: 4005394. I'm sure there are duplicates of this question, I'm just not sure which one applies to me. Thank you. Also, I think someone should add an "unplugged" tag.

    Read the article

  • Apply SetEnvIf after Apache RewriteRule

    - by coneybeare
    I have a working Apache rewrite rule:

        RewriteCond %{HTTP_HOST} ^.*foo.com
        RewriteRule (.*) http://bar.com$1 [R=301,QSA,L]

    and some working dontlog SetEnvIfs:

        SetEnvIf Request_URI "^/server-status$" dontlog
        SetEnvIf Request_URI "^/home/ping$" dontlog
        SetEnvIf Request_URI "^/haproxy-status$" dontlog
        SetEnvIf User-Agent ".*internal dummy connection.*" dontlog
        CustomLog /var/log/apache2/access.log combined env=!dontlog

    but I can't figure out how to stop the RewriteRule from logging a duplicate line. foo.com and bar.com are both on the same machine. I would expect this rule to work, but it did not:

        SetEnvIf Host "foo.com" dontlog

    I still get duplicates in the Apache log:

        10.250.18.97 - - [06/Apr/2012:16:57:12 +0000] "GET / HTTP/1.1" 200 732 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/534.55.3 (KHTML, like Gecko) Version/5.1.5 Safari/534.55.3"
        68.194.30.42 - - [06/Apr/2012:16:57:12 +0000] "GET / HTTP/1.1" 200 732 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/534.55.3 (KHTML, like Gecko) Version/5.1.5 Safari/534.55.3"

    ... where 10.250.18.97 is the server's IP. How can I prevent that RewriteRule from logging?

    Read the article

  • Stop animation playing automatically

    - by Starkers
    I've created an animation to animate a swinging mace. To do this I select the mace object in the scene pane, open the animation pane, and key it at a certain position at 0:00. I'm prompted to save this animation in my assets folder, which I do, as maceswing. I then rotate the mace, move the slider through time, and key it in a different position. I move the slider through time again, move the object to the original position, and key it. There are now three things in my assets folder: maceswing appears to be my animation, but I have no idea what Mace 1 and Mace 2 are. (I've been mucking around trying to get this working, so it's possible Mace 1 and Mace 2 are just duplicates of Mace. I still want to know what they are, though.)

    When I play my game, the mace is constantly swinging, even though I didn't apply maceswing to it, and I can't stop it. People say there's some kind of tick box to stop it constantly animating, but I can't find it. My mace object only has an Animator component, and unticking that component doesn't stop the animation playing, so I have no idea where the animation is coming from, or what the Animator component actually does. I don't want this animation constantly playing; I only want it to play once when someone clicks a certain button:

        var Mace : Transform;
        if (Input.GetButtonDown('Fire1')) {
            Mace.animation.Play('maceswing');
        };

    Upon clicking the 'Fire1' button, I get this error:

        MissingComponentException: There is no 'Animation' attached to the "Mace" game object, but a script is trying to access it. You probably need to add a Animation to the game object "Mace". Or your script needs to check if the component is attached before using it.

    There is no 'Animation' attached to the "Mace" game object, and yet I can see it swinging away constantly. In fact, I can't stop it! So what's causing the animation if the game object doesn't have an 'Animation' attached to it?

    Read the article

  • Why does my Excel document have 960,000 empty rows?

    - by C-dizzle
    I have an Excel document (Office 2007) on a Windows 7 machine (if that part matters at all; I'm not sure, but just throwing it out there). It is a list of all employee phone numbers. If I need to generate a new page, I can click on page 2 and the table will automatically generate again. The problem is, someone messed it up (it's on a network drive), and it now shows over 960,000 rows of data when there really aren't that many! I pressed CTRL+END to see if any data was in the last cell, cleared it out, and deleted that row and column, but that still didn't fix it. It almost seems like the empty space duplicates itself after the deletion. How can I fix this instead of recreating the entire document?

    Read the article

  • How to choose a server side language / framework

    - by pllee
    I am trying to come up with a list / ranking system for deciding which server-side language to choose for a particular website. Assume that familiarity with a certain language is not important and the implementation can be done in any language. Here are some things that might be important, but that I am not sure how to rank:

    - Maintainability.
    - Libraries. For example, Memcached and NoSQL support right out of the box would be a really nice addition to a particular framework.
    - 3rd-party SDKs. For example, if I need PayPal on my site, they openly provide SDKs for all scenarios in Java, PHP, and .NET. If I chose Django, I would have to rely on 3rd-party libraries that don't support everything and are not officially maintained. Would that be a dealbreaker for Django?
    - Performance. This one is tricky to put on a generic list because it can be a dealbreaker, but for many websites performance will not be an issue that the language/framework is responsible for.
    - Cost (hosting, open source).

    edit - Any reason for the votes to close? I didn't see any duplicates mentioned, and the question should not drum up a flame war.

    Read the article

  • Is the way I'm implementing my genetic algorithm right?

    - by Mhjr
    In my graduation project, I am asked to use a genetic algorithm (any variation of it can be chosen) to generate valid timetables. What I did was make a simple program that generates unique sequences representing genes; the sequence is described below (sorry if it's mathematically incorrect). The only variable in the sequence is the room element, so basically the program takes a tree that goes like this:

        [Course] -(contains)-> [Units] -(contains)-> [Offerings] -(contains)-> [Instructors] -(contains)-> [Rooms]

    - Each course can have n units (duplicates).
    - Each unit can have n offerings (lectures, lab sessions, exercises, ...).
    - Each offering has only 1 instructor.
    - Each instructor (or the whole lecture composed from the four elements of the sequence) has multiple rooms.

    When a timetable is initialized, one of these sequences, which differ only in rooms, is taken into the timetable. So the difference between the genes (sequences) of one timetable will be just the random choice of room, and the difference between chromosomes (timetables) will be the time placements of these genes (sequences). My question is: before I proceed to implement what I described, is it valid? Is the representation used here for chromosomes a permutation representation?
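
    For reference, here is a minimal sketch in Python of the representation as described; all names are hypothetical, and fitness evaluation and the genetic operators are omitted:

        import random

        def make_gene(lecture, rooms):
            # A gene fixes course, unit, offering, and instructor;
            # only the room is chosen at random.
            course, unit, offering, instructor = lecture
            return (course, unit, offering, instructor, random.choice(rooms))

        def make_chromosome(lectures, rooms_for, timeslots):
            # A chromosome (timetable) assigns each gene a timeslot.
            return [(make_gene(lec, rooms_for[lec]), random.choice(timeslots))
                    for lec in lectures]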

    Read the article

  • Restore Picasa people tags

    - by Paul
    I have loaded Windows 7 onto my laptop. Before doing this I backed up all my pictures using the Picasa backup utility, and I then ran a restore on the clean Windows 7 install. I then installed Picasa 3.5, and none of the people tags showed up. I then went and deleted what I thought was the Picasa DB and tried running the restore again. Now each folder shows up twice in Picasa, but only once under the Windows Pictures folder. How do I get rid of the duplicates in Picasa and get my people tags back?

    Read the article

  • Duplicated menu, panel indicators and taskbar

    - by Mykro
    Ubuntu 12.04. The first time I log into Gnome Classic I get a duplicate of every menu, panel indicator, and taskbar entry. If I log out and back in again I now have three copies. Log out and log in, four copies, and so forth. What would cause this? I can't see any obvious duplicates in the process list:

        $ ps -A | grep 'gno\|org\|nau'
        27439 tty7     00:00:18 Xorg
        27610 ?        00:00:00 gnome-keyring-d
        27621 ?        00:00:00 gnome-session
        27674 ?        00:00:00 gnome-settings-
        27709 ?        00:00:07 gnome-panel
        27720 ?        00:00:00 gnome-fallback-
        27726 ?        00:00:04 nautilus
        27736 ?        00:00:00 polkit-gnome-au
        28281 ?        00:00:00 gnome-screensav
        29016 ?        00:00:00 gnome-terminal
        29021 ?        00:00:00 gnome-pty-helpe

    If it helps, I was recently trying the Nouveau drivers but have now reverted to NVIDIA. Configuration is Separate X Window, Xinerama enabled. Unity is fine, so the problem is limited to the Classic desktop. I haven't had any luck googling these particular symptoms, so any tips would be really appreciated. Thanks!

    Read the article

  • Dropping duplicate|redundant Unique Constraint from FILESTREAM table

    - by electricsk8
    I have a table with a FILESTREAM column, and it has two unique constraints specified for the same FILESTREAM column, i.e.:

        ALTER TABLE [dbo].[TableName]
            ADD CONSTRAINT [UQ_TableName_33C4988760FC61CA] UNIQUE NONCLUSTERED ([GUID_Column]);
        GO
        ALTER TABLE [dbo].[TableName]
            ADD CONSTRAINT [UQ_TableName_33C49887145C0A3F] UNIQUE NONCLUSTERED ([GUID_Column]);
        GO

    I'd like to drop one of the unique constraints, as they are duplicates. However, when I try to drop one of the two, I receive the following error:

        A table with FILESTREAM column(s) must have a non-NULL unique ROWGUID column.

    Anyone know how to remove one of the two constraints?

    Read the article

  • Finding maximum number of congruent numbers

    - by Stefan Czarnecki
    Let's say we have a multiset (a set with possible duplicates) of integers. We would like to find the size of the largest subset of the multiset such that all numbers in the subset are congruent to each other modulo some m > 1. For example, take 1 4 7 7 8 10:

    - for m = 2 the subsets are (1, 7, 7) and (4, 8, 10), both of size 3;
    - for m = 3 the subsets are (1, 4, 7, 7, 10) and (8), the larger having size 5;
    - for m = 4 the subsets are (1), (4, 8), (7, 7), and (10), the largest of size 2.

    At this point it is evident that the best answer is 5, for m = 3. Given m, we can find the size of the largest subset in linear time. Because the answer is always equal to or larger than half the size of the set, it is enough to check values of m up to the median of the set. I also noticed it is only necessary to check prime values of m. However, if the values in the set are large, the algorithm is still rather slow. Does anyone have any ideas how to improve it?
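
    The linear-time step described above is a residue count: bucket every number by its residue mod m and take the largest bucket. A minimal sketch in Python of that baseline (the brute-force loop over m is the part the question wants to speed up):

        from collections import Counter

        def largest_congruent_subset(nums, m):
            # Bucket numbers by residue mod m; the biggest bucket is the
            # largest subset of mutually congruent numbers for this m.
            return max(Counter(x % m for x in nums).values())

        nums = [1, 4, 7, 7, 8, 10]
        best_m = max(range(2, max(nums) + 1),
                     key=lambda m: largest_congruent_subset(nums, m))
        print(best_m, largest_congruent_subset(nums, best_m))  # 3 5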

    Read the article

  • Location-Based redirection and duplication in sub-directories affecting SEO

    - by Joshua
    I currently own the website www.xyz.com. The website has a sub-directory for each of the 3 target countries: .../en-US/ (United States), .../es-MX/ (Mexico), and .../es-DO/ (Dominican Republic). I have two main questions about this setup:

    1. Currently, the main domain/root (xyz.com) contains a blank index.php file, but I would like a user to be redirected to one of the sub-directories based on their regional location. What is the best way to accomplish this? I have looked at browser-language-based redirection, but how would I know whether to direct a user to the MX or DO site if the browser language is set to Spanish? Is there a way to detect a user's geographic location? (A geolocation sketch follows below.)

    2. Also, the 3 websites are practically identical, except that they have 3 unique color schemes, and the US site is in English while the MX and DO sites are in Spanish. My problem is that I believe GoogleBot is penalizing/banning my site because the Spanish text on the MX and DO pages is nearly identical and is thus marked as duplicate/spam. Is there a way to avoid this?
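
    On the geolocation point: browser language alone cannot distinguish Mexico from the Dominican Republic, but an IP-to-country lookup can. A minimal sketch in Python using MaxMind's GeoLite2 country database via the geoip2 package (the site itself runs PHP, so this only illustrates the lookup; the database path and the fallback to the US site are assumptions):

        import geoip2.database

        SUBDIR = {"US": "/en-US/", "MX": "/es-MX/", "DO": "/es-DO/"}

        def redirect_path(client_ip: str) -> str:
            # Map the visitor's country code to a site sub-directory,
            # falling back to the US site for everyone else.
            with geoip2.database.Reader("GeoLite2-Country.mmdb") as reader:
                code = reader.country(client_ip).country.iso_code
            return SUBDIR.get(code, "/en-US/")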

    Read the article

  • Remove Duplicate Messages from Maildir

    - by Joseph Holsten
    I've got a bunch of duplicate messages in my IMAP server's Maildir. What's the best way to remove them? Some relevant points:

    - A shared Message-ID is usually a good enough definition of duplicate. A tiny script that removes all but one of the duplicate messages would work. (A sketch of such a script follows below.)
    - Sometimes it's necessary to find duplicates based on shared message bodies. What's a reasonable definition of "shared" here? Bitwise equivalent? What about weird differences in line wrapping, escaping, or character encoding?
    - Sometimes there's some meaningful difference between 'duplicate' messages. What's the best way to review the differences in sets of 'duplicate' messages? Diffs?
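
    For the Message-ID case, a minimal sketch in Python (dry-run by default; the Maildir path is a placeholder, and messages without a Message-ID fall back to a hash of their raw bytes):

        import email
        import hashlib
        from pathlib import Path

        def dedupe_maildir(maildir: Path, dry_run: bool = True) -> None:
            seen = {}
            # Delivered mail lives in the cur/ and new/ subdirectories.
            for msg_path in sorted(maildir.glob("cur/*")) + sorted(maildir.glob("new/*")):
                raw = msg_path.read_bytes()
                msg = email.message_from_bytes(raw)
                key = msg.get("Message-ID") or hashlib.sha256(raw).hexdigest()
                if key in seen:
                    print(f"duplicate of {seen[key]}: {msg_path}")
                    if not dry_run:
                        msg_path.unlink()
                else:
                    seen[key] = msg_path

        dedupe_maildir(Path("/home/user/Maildir"), dry_run=True)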

    Read the article

< Previous Page | 17 18 19 20 21 22 23 24 25 26 27 28  | Next Page >