Search Results

Search found 886 results on 36 pages for 'duplicates'.


How to add extensions to a lot of files using content of each file?

- by v8media

I've got over 10,000 files, left over from older versions of the Mac OS, that don't have extensions. They're deeply nested, and their names contain all sorts of strange formatting and characters. They no longer have file types or creator codes attached. A great many of these files contain text that will let me determine the extension (for example, Word.Document.8 is in every file created by that version of Word, and Excel.Sheet.8 is in every file created with that version of Excel).

I found a script that looks like it would work for one of these file types at a time, but it erases the parts of filenames that follow problematic characters, which is not good:

    find . -type f -not -name "." -print0 |\
      xargs -0 file |\
      grep 'Word.Document.8' |\
      sed 's/:.*//' |\
      xargs -I % echo mv % %.doc

So, two questions from that:

1. Should I clean the characters in the filenames first, or deal with them programmatically in the script so that the names stay the same? As long as I lose no information from the filenames, I don't see a problem with cleaning out slashes and other problem characters. Also, if I clean the filenames, there are likely to be duplicates, so any cleaning script would have to add something like "-1" before the extension to make sure nothing gets lost.

2. How do I change the script so that it looks for more than one file type at the same time and gives each the proper extension?

I'm not tied to this script, but it is understandable, which is a pro. Mac OS X 10.6 is installed on this file server, but I've got access to any recent version of OS X.

Thanks, Ian
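One way to sidestep the quoting problems in the shell pipeline is to do the walk and the rename in a single script that never passes filenames through a shell. The sketch below is only an illustration under stated assumptions: the two markers come from the question, but the `.doc`/`.xls` mapping, the function names, and the "-1" collision scheme are assumptions, and each file is read whole, which is only reasonable for modestly sized files.

```python
import os

# Marker bytes -> extension. The markers come from the question; the
# extensions they map to are an assumption.
MARKERS = {
    b"Word.Document.8": ".doc",
    b"Excel.Sheet.8": ".xls",
}


def guess_extension(path, markers=MARKERS):
    """Return the extension of the first marker found in the file, else None."""
    with open(path, "rb") as handle:
        data = handle.read()  # reads the whole file into memory
    for marker, extension in markers.items():
        if marker in data:
            return extension
    return None


def unique_target(path, extension):
    """Return path + extension, inserting -1, -2, ... if that name is taken."""
    candidate = path + extension
    counter = 1
    while os.path.exists(candidate):
        candidate = f"{path}-{counter}{extension}"
        counter += 1
    return candidate


def add_extensions(root, dry_run=True):
    """Walk root and rename (or only report) files whose content matches."""
    planned = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            source = os.path.join(directory, name)
            extension = guess_extension(source)
            # Skip unknown content and files that already carry the extension.
            if extension is None or name.lower().endswith(extension):
                continue
            target = unique_target(source, extension)
            planned.append((source, target))
            if not dry_run:
                os.rename(source, target)
    return planned
```

Calling `add_extensions(".", dry_run=True)` first returns the planned (source, target) pairs without touching anything, which plays the same role as the `echo mv` in the original pipeline. Because the original name is kept intact and only a suffix is appended, no cleaning of the filenames is needed for this step.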

Provider claiming "all web servers in the cloud are automatically kept in sync" - should I be skeptical?

- by RobMasters

I'm no expert in cloud computing. I've spent a fair bit of time researching it and various providers, but I have yet to get any hands-on experience with it. From what I've read about AWS and auto-scaling EC2 instances, though, it seems that each instance should be completely decoupled from all other instances, i.e. if content is uploaded to the web server's local filesystem from a custom CMS backend, then that content won't be available if it is subsequently requested from a different web server in the auto-scaling group. Is that right?

I met with a representative of our existing hosting provider recently, and he claimed that it isn't a problem that our legacy CMS is highly dependent on having a local filesystem. He said that all web servers, regardless of how many, would be kept as exact duplicates, so I shouldn't notice any difference compared to our existing setup of a single dedicated server. This smells a little too much like bull fecal-matter to me... should I be skeptical about this?

I'm a little worried because my (non-technical) boss, who ultimately makes the decisions, is all for signing up to this cloud solution because it won't require any extra work. I'm sure that they must at least be able to provide this, otherwise they wouldn't be attempting to sell it to us. But at what cost? It sounds as though each web server will always need to be checking the other web server(s) for new static content, which to me sounds like unwanted overhead that'll slow things down.

I'd really appreciate it if somebody could clear this up for me. I'm all for switching to AWS and using S3+CloudFront for all static content, but that isn't looking very likely to happen at the moment.

How to find what files / directories are not copied yet?

- by user8676

Hi all, I found the following 'nice' situation:

- An archive of a few disks (actually three) which holds a bunch of photos, more or less organized. Well, this is good.
- A big disk shared on a network which holds a bunch of photos in a folder structure different from the archive described above (although it is still somewhat recognizable to a human being). Some of the files on this big network share are the same as files in the archive. Well, this is bad.

What we need is to move the different (new) files from the network share into the archive (perhaps using a new disk added to the archive for this). The program we need is different from a regular duplicate file finder, because a duplicate finder usually looks for duplicates across all sources, comparing each file with every other. We want to find the differences between the two sources. It is fine for us to have a report generated as a text file, which we will then use to do the move. A Windows solution is preferred. Any ideas? TIA
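Since the folder structures differ, comparing by content rather than by path is one plausible approach: hash everything in the archive, then list the files on the share whose hash is not in that set. The following is a minimal sketch under that assumption; the function names and the choice of SHA-256 are illustrative, not taken from any particular tool, and it runs anywhere Python does, including Windows.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large photos do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iter_files(root):
    """Yield every file path under root in a stable, sorted order."""
    for directory, subdirs, filenames in os.walk(root):
        subdirs.sort()
        for name in sorted(filenames):
            yield os.path.join(directory, name)


def find_missing(archive_roots, share_root):
    """Return share files whose content exists under none of the archive roots."""
    archived = {file_digest(path)
                for root in archive_roots
                for path in iter_files(root)}
    missing, seen = [], set()
    for path in iter_files(share_root):
        digest = file_digest(path)
        # Report new content once, even if the share holds several copies.
        if digest not in archived and digest not in seen:
            missing.append(path)
        seen.add(digest)
    return missing


def write_report(paths, report_path):
    """Write one path per line to a text report."""
    with open(report_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")
```

A call such as `write_report(find_missing(["E:\\", "F:\\", "G:\\"], "\\\\server\\photos"), "to_move.txt")` would produce the text report (those drive letters and the share name are placeholders). Hashing every file is slow on large collections; grouping by file size first and hashing only sizes that occur on both sides is a common refinement.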

Cloning a failing disk (Win 7)

- by daveh551

I have a Windows 7 machine with several partitions on a 1.5TB drive. Windows has been complaining about disk errors and imminent failure, so I have purchased a new 2TB drive. The failing disk has not completely failed, and, in fact, I was able to boot Windows from it (after a couple of tries) and examine the SMART logs. The only RED item was 1 sector being reallocated.

But when I try to clone it to the new drive using Acronis True Image Home (2010), True Image can see the drive, the partitions, and the contents, but when it goes to actually do the clone, it says "Failed to move. Make sure the destination disk is not smaller than the source disk, and that there are not errors on the disk" (or something like that).

What are some other options for simply cloning the failing drive? I'd like to clone the entire disk, but I am willing to do it partition by partition if necessary. Was this a known failing of the 2010 edition of ATI, or is something really hosed in my system? Would upgrading to the 2012 edition be likely to work any better? (I'd download the trial and try it out, but if I remember right, the cloning operation is disabled in the trial version, and I don't have enough free disk space to make an entire image.)

What are some other cloning software packages if ATI won't work? Note that I'm only looking to clone the disk, not make an image as a backup. I use Ghost for that, and can fall back to that if I have to. It looks to me like CloneZilla would do the job. Any recommendations? Thanks, and if this duplicates other questions, I apologize.

How to make Thunderbird play nice with Google mail

- by Christi

Thunderbird and Gmail aren't exactly the best of friends. Gmail's tags mean that Thunderbird often downloads multiple copies of a single mail. Anything tagged in Gmail will appear in a folder related to that tag, the "all mail" folder, and possibly the "inbox" and "sent mail" folders too. Thus a mail with multiple tags could potentially be stored more than four times in a local Thunderbird cache. This can make searching difficult, and is obviously wasteful of disk space.

The best solution I have come up with is as follows. First, operate a zero inbox policy (i.e. use the inbox for processing live mail only and archive everything else), which eliminates an extra copy in the inbox. Secondly, configure Thunderbird not to sync the "Sent Mail" folder. This is a bit of a pain, since I actually find it quite useful to be able to look through just the mails I've sent, but a search can duplicate this functionality. In this way, most of the duplicates are removed, and only mail with tags is stored locally more than once.

Ideally, however, I'd like only one copy of each mail to be stored locally. I am surprised Thunderbird doesn't store mail by some sort of hashing algorithm to prevent precisely this problem, but I suppose that wouldn't be compatible with the way the folders are mirrored in a local directory structure. Can anyone think of a better way to get Thunderbird to cache a Google mail account locally and efficiently?

Website latency and bad tcp packets

- by Mistero Lupo

I have multiple websites hosted on a Linode VPS and I'm having an issue with one of them: every page that I try to load has about 10 seconds of latency. The Apache logs are clean and the other websites on the same machine are running well. At first glance I thought it was a memory problem, since the VPS has only 512M, but according to the Linode dashboard, CPU and disk I/O are normal. Anyway, here is the RAM status:

    $ free -m
                 total       used       free     shared    buffers     cached
    Mem:           487        463         23          0          2         55
    -/+ buffers/cache:        404         82
    Swap:          255        155        100

Only 23M free, but if it were a memory problem, why are the other websites running as usual? I took a live capture with Wireshark, and there are some duplicate SYN ACK packets just before the 10-second gap. I'm out of ideas and looking for some clues.

[Screenshot: Wireshark live capture]

As you can see from the image, the gap comes after the last bad TCP packet. Thank you in advance.

UPDATE: I've checked the Apache2 logs at debug level, and this is where something is happening:

    151.97.156.191 - - [14/Nov/2012:11:19:40 +0100] [www.fmaisi.it/sid#7f32c625a220][rid#7f32c6801578/subreq] (3) [perdir /home/fmaisi/sites/www.fmaisi.it/public_html/] applying pattern '^index\.php$' to uri 'index.php'
    151.97.156.191 - - [14/Nov/2012:11:19:40 +0100] [www.fmaisi.it/sid#7f32c625a220][rid#7f32c6801578/subreq] (1) [perdir /home/fmaisi/sites/www.fmaisi.it/public_html/] pass through /home/fmaisi/sites/www.fmaisi.it/public_html/index.php
    151.97.156.191 - - [14/Nov/2012:11:19:54 +0100] [www.fmaisi.it/sid#7f32c625a220][rid#7f32c6537c78/initial] (3) [perdir /home/fmaisi/sites/www.fmaisi.it/public_html/] strip per-dir prefix: /home/fmaisi/sites/www.fmaisi.it/public_html/wp-content/plugins/wp-filebase/wp-filebase_css.php -> wp-content/plugins/wp-filebase/wp-filebase_css.php
    151.97.156.191 - - [14/Nov/2012:11:19:54 +0100] [www.fmaisi.it/sid#7f32c625a220][rid#7f32c6537c78/initial] (3) [perdir /home/fmaisi/sites/www.fmaisi.it/public_html/] applying pattern '^index\.php$' to uri 'wp-content/plugins/wp-filebase/wp-filebase_css.php'

As you can see, there is a gap of 14 seconds after the pass through of index.php. Any suggestions? I'm out of ideas again.

ZFS & Deduplicating FLAC Data

- by jasongullickson

I'm experimenting with using ZFS to deduplicate a large library of FLAC files. The purpose of this is twofold:

- Reduce storage utilization
- Reduce the bandwidth needed to sync the library with cloud storage

Many of these files are of the same music tracks but from different physical media. This means that for the most part they are the same and usually close to the same size, which makes me think that they should benefit from block-level deduplication. However, in my testing I'm not seeing good results. When I create a pool and add three of these tracks (identical songs from different source media), zpool list reports 1.00 dedupe. If I copy all of the files (make exact duplicates of the three), dedupe climbs, so I know that it is enabled and functioning, but it's not finding any duplication in the original collection of files.

My first thought was that perhaps some of the variable header data (metadata tags, etc.) might be misaligning the bulk of the data in these files (the audio frames), but even making the header data consistent across the three files doesn't seem to have any impact on deduplication.

I'm considering taking alternate routes (testing other dedupe filesystems as well as some custom code), but since we're already using ZFS and I like the ZFS replication options, I'd prefer to use ZFS dedupe for this project. Perhaps it's simply not capable of working well with this sort of data, though. Any feedback regarding tuning that might improve dedupe performance for this sort of dataset, or confirmation that ZFS dedupe is not the right tool for this job, would be appreciated.
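ZFS dedup matches whole records by checksum, so two files only share storage where they contain byte-identical records at record-aligned offsets. A rough way to see what that implies is to hash fixed-size records outside ZFS and count the unique ones. The sketch below is an approximation under stated assumptions: `dedup_ratio` is a made-up helper, the 128 KiB default mirrors the ZFS default recordsize, and compression, real checksums, and on-disk layout are ignored.

```python
import hashlib


def dedup_ratio(blobs, record_size=128 * 1024):
    """Estimate total records / unique records for fixed-size, aligned records."""
    total = 0
    unique = set()
    for blob in blobs:
        for offset in range(0, len(blob), record_size):
            record = blob[offset:offset + record_size]
            total += 1
            unique.add(hashlib.sha256(record).digest())
    return total / len(unique) if unique else 1.0
```

Feeding it two exact copies of the same bytes gives 2.0, while shifting one copy by even a few bytes typically drops it back to 1.0, because every aligned record then differs. Separately encoded rips of the same track are likely to differ throughout the compressed stream rather than just by an offset, which would explain a 1.00 ratio even with identical headers.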

How to organize SQL script files

- by Mehper C. Palavuzlar

We have an Oracle 10g database (a huge one) in our company, and I provide employees with data upon request. My problem is that I save almost every SQL query I write, and now my list has grown too long. I want to organize and rename these .sql files so that I can easily find the one I want. At the moment, I'm using folders named Sales Dept, Field Team, Planning Dept, Special etc., and under those folders there are .sql files like:

- Delivery_sales_1, Delivery_sales_2, ...
- Sent_sold_lostsales_endpoints, ...
- Sales_provinces_period, Returnrates_regions_bymonths, ...
- Jack_1, Steve_1, Steve_2, ...

I try to name the files according to their content, but this makes the file names longer and does not completely meet my needs. Sometimes someone comes and demands a special report, and I give the file his name, but this is also not so good. I know that duplicates and very similar files are accumulating over time, but I don't have control over them. Can you point me in the right direction for renaming all these files and folders and organizing my queries for easier and better control? TIA.

Get an IDataReader from a typed List

- by Jason Kealey

I have a List<MyObject> with a million elements. (It is actually a SubSonic Collection, but it is not loaded from the database.) I'm currently using SqlBulkCopy as follows:

    private string FastInsertCollection(string tableName, DataTable tableData)
    {
        string sqlConn = ConfigurationManager.ConnectionStrings[SubSonicConfig.DefaultDataProvider.ConnectionStringName].ConnectionString;
        using (SqlBulkCopy s = new SqlBulkCopy(sqlConn, SqlBulkCopyOptions.TableLock))
        {
            s.DestinationTableName = tableName;
            s.BatchSize = 5000;
            s.BulkCopyTimeout = SprocTimeout; // set before the write so the timeout applies to it
            s.WriteToServer(tableData);
            s.Close();
        }
        return sqlConn;
    }

I use SubSonic's MyObjectCollection.ToDataTable() to build the DataTable from my collection. However, this duplicates the objects in memory and is inefficient. I'd like to use the SqlBulkCopy.WriteToServer overload that takes an IDataReader instead of a DataTable, so that I don't duplicate my collection in memory.

What's the easiest way to get an IDataReader from my list? I suppose I could implement a custom data reader (like here: http://blogs.microsoft.co.il/blogs/aviwortzel/archive/2008/05/06/implementing-sqlbulkcopy-in-linq-to-sql.aspx), but there must be something simpler I can do without writing a bunch of generic code.

Edit: It does not appear that one can easily generate an IDataReader from a collection of objects. Accepting the current answer even though I was hoping for something built into the framework.

Castle MonoRail ARDataBind trying to bind to non-existent row

- by dave thieben

I have a shopping cart application running on MonoRail and using Castle ActiveRecord/NHibernate. There is a ShoppingCart table and a ShoppingCartItems table, which are mapped to entities. Here's the scenario:

- A user adds things to the shopping cart, say 5 items, and goes to view the cart. The cart shows all 5 items.
- The user duplicates the tab/window and gets another tab of the same cart (call it tab B).
- The user removes an item from the cart, so now there are 4 items in tab B, but in the original tab A there are still 5 items.
- The user goes back to tab A, updates something in the cart, and clicks the "update" button, which submits the changes.
- My MonoRail action tries to do an ARDataBind on ShoppingCartItems using the data from the view, which includes all 5 items.
- When it gets to the item that the user deleted from tab B, it throws a "No row with the given identifier exists" exception for that item.

I can't figure out whether there is a way to have it not bind that row, return null, return a new instance, etc. There is an AutoLoadBehavior parameter on the ARDataBind attribute, but that appears to affect only the loading of child entities, not the root entity. Regardless of which option I choose, I get the exception before control even enters the action method (except with AutoLoadBehavior.Never, but that doesn't really help me).

Instead, I have code that calls Request.ObtainParamsNode() to pull the form nodes and parse them manually into objects, ignoring the ones that no longer exist. Is there a better way? Thanks.

NHibernate. Distinct parent child fetching

- by Andrew Kalashnikov

Hello. I've got a common NH mapping:

    <class name="Order, SummaryOrder.Core" table='order'>
      <id name="Id" unsaved-value="0" type="int">
        <column name="id" not-null="true"/>
        <generator class="native"/>
      </id>
      <many-to-one name="Client" class="SummaryOrderClient, SummaryOrder.Core" column="summary_order_client_id" cascade="none"/>
      <many-to-one name="Provider" class="SummaryOrderClient, SummaryOrder.Core" column="summary_order_provider_id" cascade="none"/>
      <set name="Items" cascade="all">
        <key column="order_id"/>
        <one-to-many class="OrderItem, Clients.Core" />
      </set>
    </class>

I want to get a list using these criteria:

    ICriteria criteria = NHibernateStateLessSession.CreateCriteria(typeof(SummaryOrder.Core.Domains.Order));
    criteria.Add(Restrictions.Or(
            Restrictions.Eq(String.Format("{0}.Id", SummaryOrder.Core.Domains.Order.Properties.Client), idClient),
            Restrictions.Eq(String.Format("{0}.Id", SummaryOrder.Core.Domains.Order.Properties.Provider), idClient)))
        .SetResultTransformer(new DistinctRootEntityResultTransformer())
        .SetFetchMode(SummaryOrder.Core.Domains.Order.Properties.Items, FetchMode.Join);
    return criteria.List<SummaryOrder.Core.Domains.Order>() as List<SummaryOrder.Core.Domains.Order>;

But I get duplicates. When I execute it with one restriction (without the OR), I get a distinct collection of orders, but the OR restriction breaks my query. I want to get a distinct collection of orders (on the client side for now). What's wrong? Please HELP!

How to get decent MySQL driver performance in Ruby

- by Zombies

I notice that I am getting very poor performance for inserts, queries, or both. The queries themselves are basic and execute with no delay directly from mysql. The Ruby script that I wrote has only 1 thread, so only 1 connection is being used, and it is never closed unless the script is terminated. It's pretty basic: I am just trying to insert a lot of rows. There is a look-up or two to get a surrogate key, or to check for duplicates, but the complexity is just O(n). Also, it isn't as if there are millions of records, so again the queries themselves take no time to run.

I am using:

- Ruby 1.9.1
- Gem/driver: ruby-mysql 2.9.2
- MySQL 5.1.37-1ubuntu5.1

^ all 32-bit versions on a 32-bit Ubuntu distro

I am getting about 1-2 inserts per second, which is pretty slow. I know a lot of people will suggest changing drivers, but that means I have some refactoring and testing to do. So I would really appreciate any help, but if you do recommend that, please at least say why (e.g. you have used ruby-mysql x.x.x before and found another MySQL driver to be better).

What I would like to know:

- How can I improve performance with ruby-mysql 2.9.2?
- If and only if I cannot do this with ruby-mysql 2.9.2, what should I do?

Entity Framework query not returning correctly enumerated results.

- by SkippyFire

I have this really strange problem where my Entity Framework query isn't enumerating correctly. The SQL Server table I'm using has a Sku field, and the column is "distinct". It isn't a key, but it doesn't contain any duplicate values. I have confirmed this using actual SQL with where, distinct and group by clauses. However, when I do this:

    // Not good
    foreach(var product in dc.Products)

or

    // Not good
    foreach(var product in dc.Products.ToList())

or

    // Not good
    foreach(var product in dc.Products.OrderBy(p => p.Sku))

the first two objects that are returned ARE THE SAME! The third item was technically the second item in the table, but then the fourth item was the first row from the table again! The only solution I have found is to use the Distinct extension method, which shouldn't really do anything in this situation:

    // Good
    foreach(var product in dc.Products.ToList().Distinct())

Another weird thing about this is that the count of the resulting queries is the same! So whether the resulting enumerable has the correct results or duplicates, I always get the number of rows in the actual table. (No, I don't have a limit clause anywhere.) What could possibly cause this?

Can I copy the CollapsiblePanelExtender in jQuery as one method?

- by Matthew Jones

I am beginning the process of moving away from the AjaxControlToolkit and toward jQuery. What I want to do is have one function that duplicates the functionality of the CollapsiblePanelExtender. For a particular hyperlink and div pair, the code looks like this:

    $('#nameHyperLink').click(function() {
        var div = $('#nameDiv');
        var link = $('#nameHyperLink');
        if (div.css('display') == 'none') {
            link.text('Hide Data');
            div.show(400);
        } else {
            link.text('Show Data');
            div.hide(400);
        }
    });

What I really want is to write this function only once, then use it for many (approx. 40) instances throughout my website. Ideally, what I want is this:

    function showHidePanel(divID, linkID, showText, hideText) {
        var div = $(divID);
        var link = $(linkID);
        if (div.css('display') == 'none') {
            link.text(hideText); // panel is about to open, so offer to hide it
            div.show(400);
        } else {
            link.text(showText); // panel is about to close, so offer to show it
            div.hide(400);
        }
    }

I would then call this function from every HyperLink involved using OnClientClick. Is there a way to do this?

The subsets-sum problem and the solvability of NP-complete problems

- by G.E.M.

I was reading about the subset-sums problem when I came up with what appears to be a general-purpose algorithm for solving it:

    (defun subset-contains-sum (set sum)
      (let ((subsets) (new-subset) (new-sum))
        (dolist (element set)
          (dolist (subset-sum subsets)
            (setf new-subset (cons element (car subset-sum)))
            (setf new-sum (+ element (cdr subset-sum)))
            (if (= new-sum sum)
                (return-from subset-contains-sum new-subset))
            (setf subsets (cons (cons new-subset new-sum) subsets)))
          (setf subsets (cons (cons element element) subsets)))))

"set" is a list not containing duplicates and "sum" is the sum to search subsets for. "subsets" is a list of cons cells where the "car" is a subset list and the "cdr" is the sum of that subset. New subsets are created from old ones in O(1) time by just cons'ing the element to the front.

I am not sure what its runtime complexity is, but it appears that with each element "set" grows by, the size of "subsets" doubles, plus one, so it appears to me to be at least quadratic.

I am posting this because my impression before was that NP-complete problems tend to be intractable and that the best one can usually hope for is a heuristic, but this appears to be a general-purpose solution that will, assuming you have the CPU cycles, always give you the correct answer. How many other NP-complete problems can be solved like this one?
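To make the growth rate concrete, here is a rough Python port of the same idea that also reports how many subsets were stored. It is a sketch rather than a line-for-line translation: the name `subset_with_sum` is made up, single-element subsets are stored as proper tuples, and a lone element equal to the target is also checked.

```python
def subset_with_sum(items, target):
    """Return (subset or None, number of subsets stored before returning)."""
    subsets = []  # list of (subset_tuple, subset_sum)
    for element in items:
        # Every stored subset spawns one new subset, plus the singleton.
        new_entries = [((element,), element)]
        for subset, total in subsets:
            new_entries.append(((element,) + subset, total + element))
        for subset, total in new_entries:
            if total == target:
                return list(subset), len(subsets)
        subsets.extend(new_entries)
    return None, len(subsets)
```

When no subset matches, the stored count after n elements follows s(n) = 2·s(n-1) + 1, which is 2^n - 1: every non-empty subset is enumerated. So the procedure is exact but exponential in the number of elements, not quadratic, which is consistent with the problem being NP-complete. Any NP-complete problem can be solved by this kind of exhaustive search; what is not known is whether any of them can be solved in polynomial time.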

NHibernate returning duplicate object in child collections when using Fetch

- by UpTheCreek

When doing a query like this (using NHibernate 2.1.2):

    ICriteria criteria = session.CreateCriteria<MyRootType>()
        .SetFetchMode("ChildCollection1", FetchMode.Eager)
        .SetFetchMode("ChildCollection2", FetchMode.Eager)
        .Add(Restrictions.IdEq(id));

I am getting multiple duplicate objects in some cartesian fashion. E.g. if ChildCollection1 has 3 elements and ChildCollection2 has 2 elements, then I get results with each element in ChildCollection1 duplicated and each element in ChildCollection2 triplicated! This was a bit of a WTF moment for me...

So how do I do this correctly? Is using SetFetchMode like this only supported when specifying one collection? Am I just using it wrong? (I've seen some references to result transformers, but imagined this would be simpler.) Is this something that's different in NH3?

Update: As per Felice's suggestion, I tried using the DistinctRootEntity transformer, but this is still returning duplicates. Code:

    ICriteria criteria = session.CreateCriteria<MyRootType>()
        .SetFetchMode("ChildCollection1", FetchMode.Eager)
        .SetFetchMode("ChildCollection2", FetchMode.Eager)
        .Add(Restrictions.IdEq(id));
    criteria.SetResultTransformer(Transformers.DistinctRootEntity);
    return criteria.UniqueResult<MyRootType>();
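The 3-and-2 example matches what a single SQL statement joining one root to two independent collections returns: one row per combination. The sketch below only models that row multiplication in plain Python; it is not NHibernate code, and the helper names are made up for the illustration.

```python
from itertools import product


def joined_rows(root_id, collection1, collection2):
    """Rows a two-collection join would return: one per (child1, child2) pair."""
    return [(root_id, a, b) for a, b in product(collection1, collection2)]


def collect_naive(rows):
    """Append every child seen in every row, as a list-like collection would."""
    return [a for _, a, _ in rows], [b for _, _, b in rows]


def collect_distinct(rows):
    """Keep each child once, preserving order, as a set-like collection would."""
    first = list(dict.fromkeys(a for _, a, _ in rows))
    second = list(dict.fromkeys(b for _, _, b in rows))
    return first, second
```

With 3 and 2 children there are 6 rows, so the naive collection sees each element of the first collection twice and each element of the second three times. A distinct-root transformer only collapses repeated roots, not repeated children, which would fit the observation that it does not help here; set-mapped collections or separate queries per collection are the usual ways around the product.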

Time complexity for Search and Insert operation in sorted and unsorted arrays that includes duplicat

- by iecut

1) For the sorted array I have used binary search. We know that the worst-case complexity of the SEARCH operation in a sorted array is O(lg N) if we use binary search, where N is the number of items in the array. What is the worst-case complexity of the search operation in an array that includes duplicate values, using binary search? Will it be the same O(lg N)? Please correct me if I am wrong! Also, what is the worst case for the INSERT operation in a sorted array using binary search? My guess is O(N). Is that right?

2) For the unsorted array I have used linear search. Now we have an unsorted array that also accepts duplicate elements/values. What are the best worst-case complexities for both the SEARCH and INSERT operations? I think that we can use linear search, which will give us O(N) worst-case time for both search and delete operations. Can we do better than this for an unsorted array, and does the complexity change if we accept duplicates in the array?
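For the sorted case, a small sketch with Python's standard `bisect` module shows both operations. The helper names `find_range` and `insert_sorted` are made up for the illustration; the `bisect` calls are the real library functions.

```python
import bisect


def find_range(sorted_values, target):
    """Return (lo, hi) so that sorted_values[lo:hi] holds every copy of target."""
    lo = bisect.bisect_left(sorted_values, target)   # first position >= target
    hi = bisect.bisect_right(sorted_values, target)  # first position > target
    return lo, hi


def insert_sorted(sorted_values, value):
    """Insert value while keeping the list sorted; duplicates are allowed."""
    bisect.insort(sorted_values, value)
```

Each binary search takes O(lg N) comparisons whether or not duplicates are present, so finding one copy, or the whole run of copies as above, stays O(lg N). Insertion finds the position in O(lg N) but then shifts up to N elements, so the guess of O(N) is right. For an unsorted array that accepts duplicates, appending is O(1) amortized, while search remains O(N) in the worst case.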

Is ASP.NET MVC destined to replace Webforms?

- by johnny

I found these questions, but a couple of them were a little old:

http://stackoverflow.com/questions/191556/should-i-pursue-asp-net-webforms-or-asp-net-mvc
http://stackoverflow.com/questions/88787/do-you-think-asp-net-mvc-will-compete-with-asp-net-webforms
http://stackoverflow.com/questions/722637/asp-net-mvc-asp-net-webforms-why

I do not believe these are duplicates and might be old enough that new light can be shed. If not please close this.

I know that no one framework or language is necessarily the only tool for every job. But, do you see MVC eclipsing webforms or webforms going lower on the priority list for Microsoft? They will have to keep webforms for a long time because so many have invested in it, but they don't have to keep adding new functionality for it.

I don't know if this is a good example, but it reminds me of web parts. I never saw much improvement in it from Microsoft. It works and I thought it was great until I started to really try and get a lot out of it. Then from what I could see it just wasn't being pursued by Microsoft that much, though it stayed in Visual Studio. Maybe that's a bad example; just what I remembered.

EDIT: Also, if anyone has any statements from Microsoft on this subject it is appreciated. No offense to anyone. I was only hoping for something official.

Posting messages in two RabbitMQ queue, instead of one (using py-amqp)

- by Khelben

I've got this strange problem using py-amqp and the Flopsy module. I have written a publisher that sends messages to a RabbitMQ server, and I wanted to be able to send them to a specified queue. With the Flopsy module that is not possible, so I tweaked it, adding a parameter and a line to declare the queue in the __init__ method of the Publisher object:

    def __init__(self, routing_key=DEFAULT_ROUTING_KEY,
                 exchange=DEFAULT_EXCHANGE, connection=None,
                 delivery_mode=DEFAULT_DELIVERY_MODE,
                 queue=DEFAULT_QUEUE):
        self.connection = connection or Connection()
        self.channel = self.connection.connection.channel()
        self.channel.queue_declare(queue)  # ADDED TO SET UP QUEUE
        self.exchange = exchange
        self.routing_key = routing_key
        self.delivery_mode = delivery_mode

The channel object is part of the py-amqplib library.

The problem I've got is that, even though it's sending the messages to the specified queue, it's ALSO sending the messages to the default queue. As we expect to send quite a lot of messages in this system, we don't want to stress it by making useless duplicates...

I've tried to debug the code and go inside the py-amqplib library, but I'm not able to figure out any error or missing step. Also, I'm not able to find any documentation for py-amqplib outside the code. Any ideas on why this is happening and how to correct it?

Extra fulltext ordering criteria beyond default relevance

- by Jeremy Warne

I'm implementing an ingredient text search, for adding ingredients to a recipe. I've currently got a full text index on the ingredient name, which is stored in a single text field, like so: "Sauce, tomato, lite, Heinz"

I've found that because there are a lot of ingredients with very similar names in the database, simply sorting by relevance doesn't work that well a lot of the time. So, I've found myself sorting by a bunch of my own rules of thumb, which probably duplicates a lot of the full-text search algorithm which spits out a numerical relevance. For instance (abridged):

    ORDER BY
      [ingredient name is exactly search term],
      [ingredient name starts with search term],
      [ingredient name starts with any word from the search and contains all search terms in some order],
      [ingredient name contains all search terms in some order],
      ...and so on.

Each of these is defined in the SELECT specification as an expression returning either 1 or 0, and so I order by those in sequential order.

I would love to hear suggestions for:

- A better way to define complicated order-by criteria in one place, say perhaps in a view or stored procedure that you can pass just the search term to and get back a set of results without having to worry about how they're ordered?
- A better tool for this than MySQL's fulltext engine -- perhaps if I was using Sphinx or something [which I've heard of but not used before], would I find some sort of complicated config option designed to solve problems like this?
- Some google search terms which might turn up discussion on how to order text items within a specific domain like this? I haven't found much that's of use.

Thanks for reading!
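The tiered ORDER BY described above amounts to sorting by a tuple of yes/no rules, which can be kept in one place as a single key function. The sketch below illustrates that idea in Python rather than SQL; `rank_key` and `rank` are hypothetical names, matching is by case-insensitive substring, and only the four rules quoted in the question are modelled.

```python
def rank_key(name, query):
    """Build a sort key: earlier rules dominate, and False sorts before True."""
    name_l = name.lower()
    query_l = query.lower().strip()
    terms = query_l.split()
    has_all = all(term in name_l for term in terms)
    starts_with_term = any(name_l.startswith(term) for term in terms)
    return (
        name_l != query_l,                  # rule 1: exact match
        not name_l.startswith(query_l),     # rule 2: starts with the whole query
        not (has_all and starts_with_term), # rule 3: starts with a term, has all
        not has_all,                        # rule 4: contains all terms
        name_l,                             # tie-breaker: alphabetical
    )


def rank(names, query):
    """Return names ordered by the tiered rules for this query."""
    return sorted(names, key=lambda name: rank_key(name, query))
```

For the query "tomato sauce", "Sauce, tomato, lite, Heinz" lands in the third tier: it contains both terms and starts with one of them. The same tuple idea carries over to a stored procedure or view that computes the 1/0 expressions and orders by them, so callers pass only the search term.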

Detecting duplicate values in a column of a Datatable while traversing through It

- by Ashish Gupta

I have a DataTable with Id (guid) and Name (string) columns. I traverse the data table, run a validation on the Name (say, it should contain only letters and numbers), and add the corresponding Id to a List if the name passes the validation. Something like this:

    List<Guid> validIds = new List<Guid>();
    foreach (DataRow row in DataTable1.Rows)
    {
        if (IsValid(row["Name"]))
        {
            validIds.Add((Guid)row["Id"]);
        }
    }

In addition to this validation, I should also check that the name is not repeated anywhere in the datatable (ignoring case). If it is repeated, I should not add the corresponding Id to the List.

Things I am thinking of or have thought about:

1) I can keep another List and check for the "Name" in it; if it exists, I will not add the corresponding Guid.

2) I cannot use HashSet, as that would treat "Test" and "test" as different strings and not as duplicates.

3) Copy the DataTable to another one holding the distinct names (I haven't tried this and the code might be incorrect; please correct me wherever possible):

    DataTable dataTableWithDistinctName = new DataTable();
    dataTableWithDistinctName.CaseSensitive = true;
    CopiedDataTable = DataTable1.DefaultView.ToTable(true, "Name");

I would loop through the original datatable and check for the existence of the "Name" in the CopiedDataTable; if it exists, I won't add the Id to the List.

Is there a better, more efficient way to achieve this? I always need to think about performance. Although there are many related questions on SO, I didn't find a problem similar to this one. If you could point me to a similar question, it would be helpful. Thanks
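One straightforward approach is two passes: first count each name under a case-insensitive key, then keep only the rows whose name is valid and occurs exactly once. The sketch below shows the idea in Python on plain (id, name) tuples; `valid_unique_ids` and the use of `str.isalnum` as the validator are assumptions for the illustration, not the question's IsValid.

```python
from collections import Counter


def valid_unique_ids(rows, is_valid):
    """Return ids whose name passes is_valid and is unique ignoring case."""
    # Pass 1: count every name under a case-insensitive key.
    counts = Counter(name.casefold() for _row_id, name in rows)
    # Pass 2: keep rows that are valid and not repeated anywhere.
    return [row_id for row_id, name in rows
            if is_valid(name) and counts[name.casefold()] == 1]
```

Both passes are linear, so the whole check is O(N) with hash lookups instead of a nested scan. The same shape should carry over to C# with a dictionary keyed by a case-insensitive string comparer.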

Lucene complex structure search

- by archer

Basically, I have a pretty simple database that I'd like to index with Lucene. The domains are:

    // Person domain
    class Person {
        Set<Pair> keys;
    }

    // Pair domain
    class Pair {
        KeyItem keyItem;
        String value;
    }

    // KeyItem domain, name is unique field within the DB (!!)
    class KeyItem {
        String name;
    }

I have tens of millions of profiles and hundreds of millions of Pairs. However, since most of the KeyItem "name" fields are duplicates, there are only a few dozen KeyItem instances. I came up with that structure to save on KeyItem instances. Basically, any Profile with any fields can be saved into that structure.

Let's say we have a profile with these properties:

- name: Andrew Morton
- education: University of New South Wales
- country: Australia
- occupation: Linux programmer

To store it, we'll have a single Profile instance, 4 KeyItem instances (name, education, country and occupation), and 4 Pair instances with the values "Andrew Morton", "University of New South Wales", "Australia" and "Linux Programmer". All other profiles will reference (all or some of) the same KeyItem instances: name, education, country and occupation.

My question is how to index all of that so I can search for a Profile by particular values of KeyItem::name and Pair::value. Ideally I'd like this kind of query to work:

    name:Andrew* AND occupation:Linux*

Should I create a custom Indexer and Searcher? Or could I use the standard ones and just map KeyItem and Pair to Lucene components somehow?
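One common way to think about this mapping is to flatten each profile into a single document whose field names are the KeyItem names and whose field values are the Pair values, so a query like `name:Andrew* AND occupation:Linux*` becomes an AND of per-field prefix matches. The sketch below models only that flattening in plain Python; it does not use the Lucene API, and `to_document` and `matches` are made-up helpers.

```python
def to_document(pairs):
    """Flatten (key_item_name, value) pairs into {field: [values]}."""
    document = {}
    for key_name, value in pairs:
        document.setdefault(key_name, []).append(value)
    return document


def matches(document, query):
    """True if every field in query has a word starting with its prefix."""
    for field, prefix in query.items():
        words = [word
                 for value in document.get(field, [])
                 for word in value.lower().split()]
        if not any(word.startswith(prefix.lower()) for word in words):
            return False
    return True
```

Because there are only a few dozen distinct KeyItem names, the flattened documents would have a small, stable set of field names, which is the situation a standard field-per-attribute index is designed for.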

Read the article
Building simple Reddit scraper

- by Bazant Fundator

Let's say that I would like to make a collection of images from reddit for my own amusement. I have run the code in my development environment and it hasn't gone past the first page of posts (anything beyond requires the after string from the JSON). Additionally, when I turn on the validation, the whole loop breaks if an item doesn't pass it, not just the current iteration. I would be glad if you helped me understand the mistakes I made. class Link include Mongoid::Document include Mongoid::Timestamps field :author, type: String field :url, type: String validates_uniqueness_of :url, # no duplicates validates :url, uniqueness :true end def fetch (count, after) count_s = count.to_s # convert count to string link = "http://reddit.com/r/aww/.json?count="+count_s+"&after="+after #so it can be used there res = HTTParty.get(link) # GET req. to the reddit server json = JSON.parse(res.body) # Parse the response if json['kind'] == "Listing" then # check if the retrieved item is a Listing for i in 1...(count) do # for each list item datum = json['data']['children'][i]['data'] #i-th element properties if datum['domain'].in?(["imgur.com", "i.imgur.com"]) then # fetch only imgur links Link.create!(author: datum['author'], url: datum['url']) # save to db end end count += 25 fetch(count, json['data']['after']) # if it retrieved the right kind of object, move on to the next page end end fetch(25," ") # run it
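Two things in the code above plausibly explain the symptoms. First, `for i in 1...(count)` skips the first child and, once `count` exceeds the page size, indexes past the end of `children`. Second, `create!` raises when validation fails, so one duplicate URL ends the whole run. The Python sketch below shows the general shape of a cursor-following loop that skips only the offending item. It is an illustration under assumptions, not the asker's Ruby: `fetch_page` is a hypothetical callable that takes the `after` cursor and returns the parsed listing, and duplicates are tracked in memory rather than by a database validation.

```python
def collect_imgur_links(fetch_page, max_pages=10):
    """Follow the 'after' cursor page by page and collect imgur links.

    fetch_page(after) is assumed to return the parsed JSON listing as a dict.
    """
    seen = set()
    links = []
    after = None
    for _ in range(max_pages):
        listing = fetch_page(after)
        if listing.get("kind") != "Listing":
            break
        # Iterate over the children actually returned, not a fixed count.
        for child in listing["data"]["children"]:
            datum = child["data"]
            if datum.get("domain") not in ("imgur.com", "i.imgur.com"):
                continue
            url = datum["url"]
            if url in seen:
                continue  # skip only this item, keep processing the page
            seen.add(url)
            links.append((datum["author"], url))
        after = listing["data"].get("after")
        if not after:
            break  # no cursor means there are no further pages
    return links
```

Passing the page fetcher in as a parameter keeps the loop testable without network access; a loop with a page limit also avoids the unbounded recursion of calling `fetch` from inside itself.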

Read the article
Subsonic - How to use SQL Schema / Owner name as part of the namespace?

- by CResults

Hi there, I've just started using Subsonic 2.2 and so far I'm very impressed; I think it'll save me some serious coding time. Before I dive into using it full time, though, there is something bugging me that I'd like to sort out. In my current database (a SQL2008 db) I have split the tables, views, sps etc. up into separate chunks by schema/owner name, so all the customer tables are in the customer. schema, products in the product. schema etc., so to select from the customer address table I'd do a select * from customer.address Unfortunately, Subsonic ignores the schema/owner name and just gives me the base table name. This is fine as I have no duplicates between schemas (e.g. Customer.Address and Supplier.Address don't both exist), but I feel the code would be clearer if I could split by schema. Ideally I'd like to be able to alter the namespace by schema/owner; I think this would have the least impact on SubSonic yet make the resulting code easier to read. The problem is, I've crawled all over the Subsonic source and don't have a clue how to do this (it doesn't help that I code in VB not C#, yes I know, blame the ZX Spectrum!!) If anyone has tackled this before or has an idea on how to solve it, I'd be really grateful. Thanks in advance. Ed

Read the article
Cloning items in a listbox c#

- by Jenny

I have two list boxes and want to be able to copy selected items from one to the other as many times as I want. I've managed to do this, but I have buttons on the second list box that let me move items up and down. Now when there are two items in the second list box that are the same (e.g. "gills" and "gills") it doesn't behave normally and crashes. Is there a way to get them to act as separate items in the second list box? Code: private void buttonUp_Click(object sender, EventArgs e) { object selected = listBox2.SelectedItem; int index = listBox2.Items.IndexOf(selected); listBox2.Items.Remove(selected); listBox2.Items.Insert(index - 1, selected); listBox2.SetSelected(index - 1, true); } private void buttonAdd_Click(object sender, EventArgs e) { DataRowView selected = (DataRowView)listBox1.SelectedItem; string item = selected["title"].ToString(); listBox2.Items.Add(item); } It works fine when I haven't got duplicates, but when I do they just jump around randomly when I press up/down. (I've not included down as it's pretty much the same as up.)
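The behaviour described is consistent with looking items up by value: with two equal strings, `IndexOf` and `Remove` both act on the first match, so the wrong copy moves, and when that first match sits at position 0 the insert at `index - 1` is out of range. Working from the selected position instead of the selected value avoids both problems. The Python sketch below illustrates the index-based move on a plain list; it is not WinForms code, and `move_up` is a hypothetical helper. In the C# above, the analogous change would be to read `listBox2.SelectedIndex` and use `Items.RemoveAt(index)` rather than `IndexOf` and `Remove`.

```python
def move_up(items, index):
    """Swap the item at index with the one above it; return the new index.

    Works on positions, so equal items are never confused with each other.
    """
    if index <= 0 or index >= len(items):
        return index  # already at the top, or nothing valid is selected
    items[index - 1], items[index] = items[index], items[index - 1]
    return index - 1


items = ["gills", "cod", "gills"]
new_index = move_up(items, 2)  # moves the second "gills", not the first
```

The returned position is what the selection should be set to afterwards, so the same item stays highlighted across repeated presses.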

Read the article
