Search Results

Search found 5153 results on 207 pages for 'unique ptr'.

Page 48/207 | < Previous Page | 44 45 46 47 48 49 50 51 52 53 54 55  | Next Page >

  • SQL Server: Clustering by timestamp; pros/cons

    - by Ian Boyd
    I have a table in SQL Server, where i want inserts to be added to the end of the table (as opposed to a clustering key that would cause them to be inserted in the middle). This means I want the table clustered by some column that will constantly increase. This could be achieved by clustering on a datetime column: CREATE TABLE Things ( ... CreatedDate datetime DEFAULT getdate(), [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (CreatedDate) ) But I can't guaranteed that two Things won't have the same time. So my requirements can't really be achieved by a datetime column. I could add a dummy identity int column, and cluster on that: CREATE TABLE Things ( ... RowID int IDENTITY(1,1), [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (RowID) ) But you'll notice that my table already constains a timestamp column; a column which is guaranteed to be a monotonically increasing. This is exactly the characteristic I want for a candidate cluster key. So I cluster the table on the rowversion (aka timestamp) column: CREATE TABLE Things ( ... [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (timestamp) ) Rather than adding a dummy identity int column (RowID) to ensure an order, I use what I already have. What I'm looking for are thoughts of why this is a bad idea; and what other ideas are better. Note: Community wiki, since the answers are subjective.

    Read the article

  • SQL Server: Clutering by timestamp; pros/cons

    - by Ian Boyd
    i have a table in SQL Server, where i want inserts to be added to the end of the table (as opposed to a clustering key that would cause them to be inserted in the middle). This means i want the table clustered by some column that will constantly increase. This could be achieved by clustering on a datetime column: CREATE TABLE Things ( ... CreatedDate datetime DEFAULT getdate(), [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (CreatedDate) ) But i can't guaranteed that two Things won't have the same time. So my requirements can't really be achieved by a datetime column. i could add a dummy identity int column, and cluster on that: CREATE TABLE Things ( ... RowID int IDENTITY(1,1), [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (RowID) ) But you'll notice that my table already constains a timestamp column; a column which is guaranteed to be a monotonically increasing. This is exactly the characteristic i want for a candidate cluster key. So i cluster the table on the rowversion (aka timestamp) column: CREATE TABLE Things ( ... [timestamp] timestamp, CONSTRAINT [IX_Things] UNIQUE CLUSTERED (timestamp) ) Rather than adding a dummy identity int column (RowID) to ensure an order, i use what i already have. What i'm looking for are thoughts of why this is a bad idea; and what other ideas are better. Note: Community wiki, since the answers are subjective.

    Read the article

  • Find three numbers appeared only once

    - by shilk
    In a sequence of length n, where n=2k+3, that is there are k unique numbers appeared twice and three numbers appeared only once. The question is: how to find the three unique numbers that appeared only once? for example, in sequence 1 1 2 6 3 6 5 7 7 the three unique numbers are 2 3 5. Note: 3<=n<1e6 and the number will range from 1 to 2e9 Memory limits: 1000KB , this implies that we can't store the whole sequence. Method I have tried(Memory limit exceed): I initialize a tree, and when read in one number I try to remove it from the tree, if the remove returns false(not found), I add it to the tree. Finally, the tree has the three numbers. It works, but is Memory limit exceed. I know how to find one or two such number(s) using bit manipulation. So I wonder if we can find three using the same method(or some method similar)? Method to find one/two number(s) appeared only once: If there is one number appeared only once, we can apply XOR to the sequence to find it. If there are two, we can first apply XOR to the sequence, then separate the sequence into 2 parts by one bit of the result that is 1, and again apply XOR to the 2 parts, and we will find the answer.

    Read the article

  • NSDictionary, NSArray, NSSet and efficiency

    - by ryyst
    Hi, I've got a text file, with about 200,000 lines. Each line represents an object with multiple properties. I only search through one of the properties (the unique ID) of the objects. If the unique ID I'm looking for is the same as the current object's unique ID, I'm gonna read the rest of the object's values. Right now, each time I search for an object, I just read the whole text file line by line, create an object for each line and see if it's the object I'm looking for - which is basically the most inefficient way to do the search. I would like to read all those objects into memory, so I can later search through them more efficiently. The question is, what's the most efficient way to perform such a search? Is a 200,000-entries NSArray a good way to do this (I doubt it)? How about an NSSet? With an NSSet, is it possible to only search for one property of the objects? Thanks for any help! -- Ry

    Read the article

  • Server side form validation and POST data

    - by tomcritchlow
    Hi, I have a user input form here: http://www.7bks.com/create (Google login required) When you first create a list you are asked to create a public username. Unfortuantely currently there is no constraint to make this unique. I'm working on the code to enforce unique usernames at the moment and would like to know the best way to do it. Tech details: appengine, python, webapp framework What I'm planning is something like this: first the /create form posts the data to /inputlist/ (this is the same as currently happens) /inputlist/ queries the datastore for the given username. If it already exists then redirect back to /create display the /create page with all the info previously but with an additional error message of "this username is already taken" My question is: Is this the best way of handling server side validation? What's the best way of storing the list details while I verify and modify the username? As I see it I have 3 options to store the list details but I'm not sure which is "best": Store the list details in the session cookie (I am using GAEsessions for cookies) Define a separate POST class for /create and post the list data back from /inputlist/ to the /create page (currently /create only has a GET class) Store the list in the datastore, even though the username is non-unique. Thank you very much for your help :) I'm pretty new to python and coding in general so if I've missed something obvious my apologies. Tom PS - I'm sure I can eventually figure it out but I can't find any documentation on POSTing data using the webapp appengine framework which I'd need in order to do solution 2 above :s maybe you could point me in the right direction for that too? Thanks! PPS - It's a little out of date now but you can see roughly how the /create and /inputlist/ code works at the moment here: 7bks.com Gist

    Read the article

  • Finds in Rails 3 and ActiveRelation

    - by TheDelChop
    Guys, I'm trying to understand the new arel engine in Rails 3 and I've got a question. I've got two models, User and Task class User < ActiveRecord::Base has_many :tasks end class Task < ActiveRecord::Base belongs_to :user end here is my routes to imply the relation: resources :users do resources :tasks end and here is my Tasks controller: class TasksController < ApplicationController before_filter :load_user def new @task = @user.tasks.new end private def load_user @user = User.where(:id => params[:user_id]) end end Problem is, I get the following error when I try to invoke the new action: NoMethodError: undefined method `tasks' for #<ActiveRecord::Relation:0x3dc2488> I am sure my problem is with the new arel engine, does anybody understand what I'm doing wrong? Sorry guys, here is my schema.db file: ActiveRecord::Schema.define(:version => 20100525021007) do create_table "tasks", :force => true do |t| t.string "name" t.integer "estimated_time" t.datetime "created_at" t.datetime "updated_at" t.integer "user_id" end create_table "users", :force => true do |t| t.string "email", :default => "", :null => false t.string "encrypted_password", :limit => 128, :default => "", :null => false t.string "password_salt", :default => "", :null => false t.string "reset_password_token" t.string "remember_token" t.datetime "remember_created_at" t.integer "sign_in_count", :default => 0 t.datetime "current_sign_in_at" t.datetime "last_sign_in_at" t.string "current_sign_in_ip" t.string "last_sign_in_ip" t.datetime "created_at" t.datetime "updated_at" t.string "username" end add_index "users", ["email"], :name => "index_users_on_email", :unique => true add_index "users", ["reset_password_token"], :name => "index_users_on_reset_password_token", :unique => true add_index "users", ["username"], :name => "index_users_on_username", :unique => true end Thank you, Joe

    Read the article

  • Changing the indexing on existing table in SQL Server 2000

    - by Raj
    Guys, Here is the scenario: SQL Server 2000 (8.0.2055) Table currently has 478 million rows of data. The Primary Key column is an INT with IDENTITY. There is an Unique Constraint imposed on two other columns with a Non-Clustered Index. This is a vendor application and we are only responsible for maintaining the DB. Now the vendor has recommended doing the following "to improve performance" Drop the PK and Clustered Index Drop the non-clustered index on the two columns with the UNIQUE CONSTRAINT Recreate the PK, with a NON-CLUSTERED index Create a CLUSTERED index on the two columns with the UNIQUE CONSTRAINT I am not convinced that this is the right thing to do. I have a number of concerns. By dropping the PK and indexes, you will be creating a heap with 478 million rows of data. Then creating a CLUSTERED INDEX on two columns would be a really mammoth task. Would creating another table with the same structure and new indexing scheme and then copying the data over, dropping the old table and renaming the new one be a better approach? I am also not sure how the stored procs will react. Will they continue using the cached execution plan, considering that they are not being explicitly recompiled. I am simply not able to understand what kind of "performance improvement" this change will provide. I think that this will actually have the reverse effect. All thoughts welcome. Thanks in advance, Raj

    Read the article

  • Change with jQuery a cell of a table created with JSF

    - by perissf
    From within a xhtml page created with JSF, I need to use JavaScript / jQuery for changing the content of a cell of a table. I know how to assign a unique id to the div containing the table, and to the tbody. I can also assign unique class names to the div itself and to the target column. The target row is identified by the data-rk attribute. <div id="tabForm:centerTabView:personsTable" class="ui-datatable ui-widget personsTable"> <table role="grid"> <tbody id="tabForm:centerTabView:personsTable_data" > <tr data-rk="2" > <td ... /> <td class="lastNameCol" role="gridcell"> <div> To Be Edited </div> </td> <td ... /> </tr> <tr ... /> </tbody> </table> </div> I have tried with many combinations of different jQuery selectors, but I am really lost. I need to search my target row and my target column inside that particular div or inside that particular table, because the xhtml page may contain other tables with different unique ids (and accidentally with the same row and column ids).

    Read the article

  • [Database] How to model this one-to-one relation?

    - by pbean
    I have several entities which respresent different types of users who need to be able to log in to a particular system. Additionally, they have different types of information associated with them. For example: a "general user", which has an e-mail address and "admin user", which has a workstation number (note that this a hypothetical case). Both entities also share common properties like first name, surname, address and telephone number. Finally, they naturally need to have a (unique) user name and a password to log in. In the application, the user just has to fill in his user name and password, and the functionality of the application changes slightly according to the type of the user. You can imagine that the username needs to be unique for this work. How should I model this effectively? I can't just create two tables, because then I can't force a unique constaint on the user name. I also can't put them all in just one table, because they have different types of specific information associated to them. I think I might need 3 seperate tables, one for "users" (with user name and password), one for the "general users" and another one for the "admin users", but how would the relations between these work? Or is there another solution? (By the way, the target DBMS is MySQL, so I don't think generalization is supported in the database system itself).

    Read the article

  • What category of combinatorial problems appear on the logic games section of the LSAT?

    - by Merjit
    There's a category of logic problem on the LSAT that goes like this: Seven consecutive time slots for a broadcast, numbered in chronological order I through 7, will be filled by six song tapes-G, H, L, O, P, S-and exactly one news tape. Each tape is to be assigned to a different time slot, and no tape is longer than any other tape. The broadcast is subject to the following restrictions: L must be played immediately before O. The news tape must be played at some time after L. There must be exactly two time slots between G and P, regardless of whether G comes before P or whether G comes after P. I'm interested in generating a list of permutations that satisfy the conditions as a way of studying for the test and as a programming challenge. However, I'm not sure what class of permutation problem this is. I've generalized the type problem as follows: Given an n-length array A: How many ways can a set of n unique items be arranged within A? Eg. How many ways are there to rearrange ABCDEFG? If the length of the set of unique items is less than the length of A, how many ways can the set be arranged within A if items in the set may occur more than once? Eg. ABCDEF = AABCDEF; ABBCDEF, etc. How many ways can a set of unique items be arranged within A if the items of the set are subject to "blocking conditions"? My thought is to encode the restrictions and then use something like Python's itertools to generate the permutations. Thoughts and suggestions are welcome.

    Read the article

  • Creating an appropriate index for a frequently used query in SQL Server

    - by Slauma
    In my application I have two queries which will be quite frequently used. The Where clauses of these queries are the following: WHERE FieldA = @P1 AND (FieldB = @P2 OR FieldC = @P2) and WHERE FieldA = @P1 AND FieldB = @P2 P1 and P2 are parameters entered in the UI or coming from external datasources. FieldA is an int and highly on-unique, means: only two, three, four different values in a table with say 20000 rows FieldB is a varchar(20) and is "almost" unique, there will be only very few rows where FieldB might have the same value FieldC is a varchar(15) and also highly distinct, but not as much as FieldB FieldA and FieldB together are unique (but do not form my primary key, which is a simple auto-incrementing identity column with a clustered index) I'm wondering now what's the best way to define an index to speed up specifically these two queries. Shall I define one index with... FieldB (or better FieldC here?) FieldC (or better FieldB here?) FieldA ... or better two indices: FieldB FieldA and FieldC FieldA Or are there even other and better options? What's the best way and why? Thank you for suggestions in advance!

    Read the article

  • Replacing all GUIDs in a file with new GUIDs from the command line

    - by Josh Petrie
    I have a file containing a large number of occurrences of the string Guid="GUID HERE" (where GUID HERE is a unique GUID at each occurrence) and I want to replace every existing GUID with a new unique GUID. This is on a Windows development machine, so I can generate unique GUIDs with uuidgen.exe (which produces a GUID on stdout every time it is run). I have sed and such available (but no awk oddly enough). I am basically trying to figure out if it is possible (and if so, how) to use the output of a command-line program as the replacement text in a sed substitution expression so that I can make this replacement with a minimum of effort on my part. I don't need to use sed -- if there's another way to do it, such as some crazy vim-fu or some other program, that would work as well -- but I'd prefer solutions that utilize a minimal set of *nix programs since I'm not really on *nix machines. To be clear, if I have a file like this: etc etc Guid="A" etc etc Guid="B" I would like it to become this: etc etc Guid="C" etc etc Guid="D" where A, B, C, D are actual GUIDs, of course. (for example, I have seen xargs used for things similar to this, but it's not available on the machines I need this to run on, either. I could install it if it's really the only way, although I'd rather not)

    Read the article

  • customer.name joining transactions.name vs. customer.id [serial] joining transactions.id [integer]

    - by Frank Computer
    INFORMIX-SQL 7.32 Pawnshop Application: one-to-many relationship where each customer (master) can have many transactions (detail). customer( id serial, pk_name char(30), {PATERNAL-NAME MATERNAL-NAME, FIRST-NAME MIDDLE-NAME} [...] ); unique index on id; unique cluster index on name; transaction( fk_name char(30), ticket_number serial, [...] ); dups cluster index on fk_name; unique index on ticket_number; Several people have told me this is not the correct way to join master to detail. They said I should always join customer.id[serial] to transactions.id[integer]. When a customer pawns merchandise, clerk queries the master using wildcards on name. The query usually returns several customers, clerk scrolls until locating the right name, enters a 'D' to change to detail transactions table, all transactions are automatically queried, then clerk enters an 'A' to add a new transaction. The problem with using customer.id joining transaction.id is that although the customer table is maintained in sorted name order, clustering the transaction table by fk_id groups the transactions by fk_id, but they are not in the same order as the customer name, so when clerk is scrolling through customer names in the master, the system has to jump allover the place to locate the clustered transactions belonging to each customer. As each new customer is added, the next id is assigned to that customer, but new customers dont show up in alphabetical order. I experimented using id joins and confirmed the decrease in performance. How can I use id joins instead of name joins and still preserve the clustered transaction order by name if transactions has no name column?

    Read the article

  • When is the reintegrate option really necessary?

    - by Tor Hovland
    If you always sync a feature branch before you merge it back, why do you really have to use the --reintegrate option? The Subversion book says: When merging your branch back to the trunk, however, the underlying mathematics is quite different. Your feature branch is now a mishmosh of both duplicated trunk changes and private branch changes, so there's no simple contiguous range of revisions to copy over. By specifying the --reintegrate option, you're asking Subversion to carefully replicate only those changes unique to your branch. (And in fact, it does this by comparing the latest trunk tree with the latest branch tree: the resulting difference is exactly your branch changes!) So the --reintegrate option only merges the changes that are unique to the feature branch. But if you always sync before merge (which is a recommended practice, in order to deal with any conflicts on the feature branch), then the only changes between the branches are the changes that are unique to the feature branch, right? And if Subversion tries to merge code that is already on the target branch, it will just do nothing, right? In this blog post, Mark Phippard writes: http://blogs.open.collab.net/svn/2008/07/subversion-merg.html If we include those synched revisions, then we merge back changes that already exist in trunk. This yields unnecessary and confusing conflicts. Can somebody give me an example of when dropping reintegrate gives me unnecessary conflicts?

    Read the article

  • How to map combinations of things to a relational database?

    - by Space_C0wb0y
    I have a table whose records represent certain objects. For the sake of simplicity I am going to assume that the table only has one row, and that is the unique ObjectId. Now I need a way to store combinations of objects from that table. The combinations have to be unique, but can be of arbitrary length. For example, if I have the ObjectIds 1,2,3,4 I want to store the following combinations: {1,2}, {1,3,4}, {2,4}, {1,2,3,4} The ordering is not necessary. My current implementation is to have a table Combinations that maps ObjectIds to CombinationIds. So every combination receives a unique Id: ObjectId | CombinationId ------------------------ 1 | 1 2 | 1 1 | 2 3 | 2 4 | 2 This is the mapping for the first two combinations of the example above. The problem is, that the query for finding the CombinationId of a specific Combination seems to be very complex. The two main usage scenarios for this table will be to iterate over all combinations, and the retrieve a specific combination. The table will be created once and never be updated. I am using SQLite through JDBC. Is there any simpler way or a best practice to implement such a mapping?

    Read the article

  • what's the performance difference between int and varchar for primary keys

    - by user568576
    I need to create a primary key scheme for a system that will need peer to peer replication. So I'm planning to combine a unique system ID and a sequential number in some way to come up with unique ID's. I want to make sure I'll never run out of ID's, so I'm thinking about using a varchar field, since I could always add another character if I start running out. But I've read that integers are better optimized for this. So I have some questions... 1) Are integers really better optimized? And if they are, how much of a performance difference is there between varchars and integers? I'm going to use firebird for now. But I may switch later. Or possibly support multiple db's. So I'm looking for generalizations, if that's possible. 2) If integers are significantly better optimized, why is that? And is it likely that varchars will catch up in the future, so eventually it won't matter anyway? My varchar keys won't have any meaning, except for the unique system ID part. But I may want to obscure that somehow. Also, I plan to efficiently use all the bits of each character. I don't, for example, plan to code the integer 123 as the character string "123". So I don't think varchars will require more space than integers.

    Read the article

  • Is it possible to limit outside connections to a subdomain with .htaccess or similar?

    - by digidave0205
    I host a web application. This application serves static html pages that are refreshed at various intervals. Some as often as every 30 secs. At this time I have about 300 unique pages that are accessed via 300 unique subdomains. Some clients have at most 50 visitors to their unique page and it refreshes every 30 secs, no problem. Other clients have up to 1000 or more visitors to their page. These clients are killing my server. There was no predefined limit upon signup but now I have to impose such a limit to remain afloat financially. I would like to define a finite number of connections allowed for each individual subdomain in my hosting account. Connections attempted out of range of this finite value would either be rejected or redirected. I have access to .htaccess and php.ini. Is something of this nature possible? Oh, I have a dedicated/managed server at 1and1.

    Read the article

  • Speed/expensive of SQLite query vs. List.contains() for "in-set" icon on list rows

    - by kpdvx
    An application I'm developing requires that the app main a local list of things, let's say books, in a local "library." Users can access their local library of books and search for books using a remote web service. The app will be aware of other users of the app through this web service, and users can browse other users' lists of books in their library. Each book is identified by a unique bookId (represented as an int). When viewing books returned through a search result or when viewing another user's book library, the individual list row cells need to visually represent if the book is in the user's local library or not. A user can have at most 5,000 books in the library, stored in SQLite on the device (and synchronized with the remote web service). My question is, to determine if the book shown in the list row is in the user's library, would it be better to directly ask SQLite (via SELECT COUNT(*)...) or to maintain, in-memory, a List or int[] array of some sort containing the unique bookIds. So, on each row display do I query SQLite or check if the List or int[] array contains the unique bookId? Because the user can have at most 5,000 books, each bookId occupies 4 bytes so at most this would use ~ 20kB. In thinking about this, and in typing this out, it seems obvious to me that it would be far better for performance if I maintained a list or int[] array of in-library bookIds vs. querying SQLite (the only caveat to maintaining an int[] array is that if books are added or removed I'll need to grow or shrink the array by hand, so with this option I'll most likely use an ArrayList or Vector, though I'm not sure of the additional memory overhead of using Integer objects as opposed to primitives). Opinions, thoughts, suggestions?

    Read the article

  • REST Rails 2 nested routes without resource names?

    - by mrbrdo
    I'm using Rails 2. I have resources nested like this: - university_categories - universities - studies - professors - comments I wish to use RESTful routes, but I don't want all that clutter in my URL. For example instead of: /universities/:university_id/studies/:study_id/professors/:professor_id I want: /professors/:university_id/:study_id/:professor_id (I don't map professors seperately so there shouldn't be a confusion between this and /professors/:professor_id since that route shouldn't exist). Again, I want to use RESTful resources/routes... Also note, I am using slugs instead of IDs. Slugs for studies are NOT unique, while other are. Also, there are no many-to-many relationships (so if I know the slug of a professor, which is unique, I also know which study and university and category it belongs to, however I still wish this information to be in the URI if possible for SEO, and also it is necessary when adding a new professor). I do however want to use shallow nesting for "administrator" URIs like edit, destroy (note the problem here with Study since it's slug is not unique, though)... I would also like some tips on how to use the url helpers so that I don't have too much to fix if I change the routes in the future... Thank you.

    Read the article

  • MySQL PHP | "SELECT FROM table" using "alphanumeric"-UUID. Speed vs. Indexed Integer / Indexed Char

    - by dropson
    At the moment, I select rows from 'table01' using: SELECT * FROM table01 WHERE UUID = 'whatever'; The UUID column is a unique index. I know this isn't the fastest way to select data from the database, but the UUID is the only row-identifier that is available to the front-end. Since I have to select by UUID, and not ID, I need to know what of these two options I should go for, if say the table consists of 100'000 rows. What speed differences would I look at, and would the index for the UUID grow to large, and lag the DB? Get the ID before doing the "big" select 1. $id = "SELECT ID FROM table01 WHERE UUID = '{alphanumeric character}'"; 2. $rows = SELECT * FROM table01 WHERE ID = $id; Or keep it the way it is now, using the UUID. 1. SELECT FROM table01 WHERE UUID '{alphanumeric character}'; Side note: All new rows are created by checking if the system generated uniqueid exists before trying to insert a new row. Keeping the column always unique. The "example" table. CREATE TABLE Table01 ( ID int NOT NULL PRIMARY KEY, UUID char(15), name varchar(100), url varchar(255), `date` datetime ) ENGINE = InnoDB; CREATE UNIQUE INDEX UUID ON Table01 (UUID);

    Read the article

  • question about prefix and suffix in java

    - by kate
    hi!!I have an issue and i want your help!I have a specific REGEX which i have named it "unique" and i want to store the data that it gives me!! The prefix and the suffix of this regex!!so!!someone has told me to use var prefix and var suffix but i don't know how to do it! for your convinience i give u my xml for this regex <reg name="unique"> <start><![CDATA[ <!-- 72 HOURS FORECASTS --> ]]></start> <end><![CDATA[<!-- 72 HOURS FORECASTS -->]]></end> </reg> the code where i call this regex is int k = 0; for(Xml reg_is:fetchsite.child("site").child("regexps").children("reg")) { if(reg_is.string("name").contains("unique")){ if(reg_is.child("start").content()=="") error += "\tNo prefix reg.exp. given.\n"; else prefix = HtmlMethods.removeBreaks(replaceVariables(reg_is.child("start").content())); if(reg_is.child("end").content()=="") error += "\tNo suffix reg.exp. given.\n"; else suffix = HtmlMethods.removeBreaks(replaceVariables(reg_is.child("end").content())); } else{ poleis[k][0]= HtmlMethods.removeBreaks(reg_is.string("name")); poleis[k][1] = HtmlMethods.removeBreaks(replaceVariables(reg_is.child("start").content())); poleis[k][2] = HtmlMethods.removeBreaks(replaceVariables(reg_is.child("end").content())); k++; }

    Read the article

  • How would I sort files to directories based on filenames?

    - by gnomed
    I have a huge number of files to sort all named in some terrible convention. Here are some examples: (4)_mr__mcloughlin____.txt 12__sir_john_farr____.txt (b)mr__chope____.txt dame_elaine_kellett-bowman____.txt dr__blackburn______.txt These names are supposed to be a different person (speaker) each. Someone in another IT department produced these from a ton of XML files using some script but the naming is unfathomably stupid as you can see. I need to sort literally tens of thousands of these files with multiple files of text for each person; each with something stupid making the filename different, be it more underscores or some random number. They need to be sorted by speaker. This would be easier with a script to do most of the work then I could just go back and merge folders that should be under the same name or whatever. There are a number of ways I was thinking about doing this. parse the names from each file and sort them into folders for each unique name. get a list of all the unique names from the filenames, then look through this simplified list of unique names for similar ones and ask me whether they are the same, and once it has determined this it will sort them all accordingly. I plan on using Perl, but I can try a new language if it's worth it. I'm not sure how to go about reading in each filename in a directory one at a time into a string for parsing into an actual name. I'm not completely sure how to parse with regex in perl either, but that might be googleable. For the sorting, I was just gonna use the shell command: `cp filename.txt /example/destination/filename.txt` but just cause that's all I know so it's easiest. I dont even have a pseudocode idea of what im going to do either so if someone knows the best sequence of actions, im all ears. I guess I am looking for a lot of help, I am open to any suggestions. Many many many thanks to anyone who can help. B.

    Read the article

  • reformatting a matrix in matlab with nan values

    - by Kate
    This post follows a previous question regarding the restructuring of a matrix: re-formatting a matrix in matlab An additional problem I face is demonstrated by the following example: depth = [0:1:20]'; data = rand(1,length(depth))'; d = [depth,data]; d = [d;d(1:20,:);d]; Here I would like to alter this matrix so that each column represents a specific depth and each row represents time, so eventually I will have 3 rows (i.e. days) and 21 columns (i.e. measurement at each depth). However, we cannot reshape this because the number of measurements for a given day are not the same i.e. some are missing. This is known by: dd = sortrows(d,1); for i = 1:length(depth); e(i) = length(dd(dd(:,1)==depth(i),:)); end From 'e' we find that the number of depth is different for different days. How could I insert a nan into the matrix so that each day has the same depth values? I could find the unique depths first by: unique(d(:,1)) From this, if a depth (from unique) is missing for a given day I would like to insert the depth to the correct position and insert a nan into the respective location in the column of data. How can this be achieved?

    Read the article

  • PHP | Online Notepad

    - by user2947423
    I recently made a Chat application in Visual Basic using PHP. I used this code: <?php $msg = $_GET['w']; $logfile= 'Chats.php'; $fp = fopen($logfile, "a"); fwrite($fp, $msg); fclose($fp); ?> I'm now trying to make a Online Notepad. What i want to do is in Visual Basic create a unique ID. That unique ID, has to be his filename. I'm not very good with PHP so what i want to know is: I want the unique ID to be the filename of the "Note". Like: $logfile= '{uniqueID.php}'; Whenever the user opens the program, it'll open his uniqueID.php file and he can edit that in my program. Long Story Short (TL;DR) Program generates uniqueID uniqueID is going to be a new file; {uniqueID}.php On next open it will check if {uniqueID}.php of him/her exists else it will make a new one. I know this isn't really secure but it's to learn something for myself.

    Read the article

  • MySQL - Exclude rows from Select based on duplication of two columns

    - by Carson C.
    I am attempting to narrow results of an existing complex query based on conditional matches on multiple columns within the returned data set. I'll attempt to simplify the data as much as possible here. Assume that the following table structure represents the data that my existing complex query has already selected (here ordered by date): +----+-----------+------+------------+ | id | remote_id | type | date | +----+-----------+------+------------+ | 1 | 1 | A | 2011-01-01 | | 3 | 1 | A | 2011-01-07 | | 5 | 1 | B | 2011-01-07 | | 4 | 1 | A | 2011-05-01 | +----+-----------+------+------------+ I need to select from that data set based on the following criteria: If the pairing of remote_id and type is unique to the set, return the row always If the pairing of remote_id and type is not unique to the set, take the following action: Of the sets of rows for which the pairing of remote_id and type are not unique, return only the single row for which date is greatest and still less than or equal to now. So, if today is 2010-01-10, I'd like the data set returned to be: +----+-----------+------+------------+ | id | remote_id | type | date | +----+-----------+------+------------+ | 3 | 1 | A | 2011-01-07 | | 5 | 1 | B | 2011-01-07 | +----+-----------+------+------------+ For some reason I'm having no luck wrapping my head around this one. I suspect the answer lies in good application of group_by, but I just can't grasp it. Any help is greatly appreciated!

    Read the article

< Previous Page | 44 45 46 47 48 49 50 51 52 53 54 55  | Next Page >