Search Results

Search found 13927 results on 558 pages for 'programming theory'.

Page 393/558 | < Previous Page | 389 390 391 392 393 394 395 396 397 398 399 400 | Next Page >

Dimension Reduction in Categorical Data with missing values

- by user227290

I have a regression model in which the dependent variable is continuous but ninety percent of the independent variables are categorical(both ordered and unordered) and around thirty percent of the records have missing values(to make matters worse they are missing randomly without any pattern, that is, more that forty five percent of the data hava at least one missing value). There is no a priori theory to choose the specification of the model so one of the key tasks is dimension reduction before running the regression. While I am aware of several methods for dimension reduction for continuous variables I am not aware of a similar statical literature for categorical data (except, perhaps, as a part of correspondence analysis which is basically a variation of principal component analysis on frequency table). Let me also add that the dataset is of moderate size 500000 observations with 200 variables. I have two questions. Is there a good statistical reference out there for dimension reduction for categorical data along with robust imputation (I think the first issue is imputation and then dimension reduction)? This is linked to implementation of above problem. I have used R extensively earlier and tend to use transcan and impute function heavily for continuous variables and use a variation of tree method to impute categorical values. I have a working knowledge of Python so if something is nice out there for this purpose then I will use it. Any implementation pointers in python or R will be of great help. Thank you.

Read the article
can bind successfully to the ldap server, but needs to know how to find user w/i AD

- by Brad

I create a login form to bind to the ldap server, if successful, it creates a session (which the user's username is stored within), then I go to another page that has session_start(); and it works fine. What I want to do now, is add code to test if that user is a member of a specific group. So in theory, this is what I want to do if(username session is valid) { search ldap for user -> get list of groups user is member of foreach(group they are member of) { switch(group) { case STAFF: print 'they are member of staff group'; $access = true; break; default: print 'not a member of STAFF group'; $access = false; break; } if(group == STAFF) { break; } } if($access == TRUE) { // you have access to the content on this page } else { // you do not have access to this page } } How do I do a ldap_search w/o binding? I don't want to keep asking for their password on each page, and I can't pass their password thru a session. Any help is appreciated.

Read the article
Will TFS 2010 support non-contiguous merging?

- by steve_d

I know that merging non-contiguous changesets at once may not be a good idea. However there is at least one situation in which merging non-contiguous changesets is (probably) not going to break anything: when there are no intervening changes on the affected individual files. (At least, it wouldn't break any worse than would a series of cherry-picked merges, checked in each time; and at least this way you would discover breakage before checking in). For instance, let's say you have a Main and a Development branch. They start out identical (e.g. after a release). They have two files, foo.cs and bar.cs. Alice makes a change in Development\foo.cs and checks it in as changeset #1001. Bob makes a change in Development\bar.cs and checks it in as #1002. Alice makes another change to Development\foo.cs and checks it in as #1003. Now we could in theory merge both changes #1001 and #1003 from dev-to main in a single operation. If we try to merge at the branch level, dev-to-main, we will have to do it as two operations. In this simple, contrived example it's simple enough to merge the one file - but in the real world where there would be many files involved, it's not so simple. Non-contiguous merging is one of the reasons given for why "merge by workitem" is not implemented in TFS.

Read the article
struct constructor + function parameter

- by Oops

Hi, I am a C++ beginner. I have the following code, the reult is not what I expect. The question is why, resp. what is wrong. For sure, the most of you see it at the first glance. struct Complex { float imag; float real; Complex( float i, float r) { imag = i; real = r; } Complex( float r) { Complex(0, r); } std::string str() { std::ostringstream s; s << "imag: " << imag << " | real: " << real << std::endl; return s.str(); } }; class Complexes { std::vector<Complex> * _complexes; public: Complexes(){ _complexes = new std::vector<Complex>; } void Add( Complex elem ) { _complexes->push_back( elem ); } std::string str( int index ) { std::ostringstream oss; Complex c = _complexes->at(index); oss << c.str(); return oss.str(); } }; int main(){ Complexes * cs = new Complexes(); //cs->Add(123.4f); cs->Add(Complex(123.4f)); std::cout << cs->str(0); return 0; } for now I am interested in the basics of c++ not in the complexnumber theory ;-) it would be nice if the "Add" function does also accept one real (without an extra overloading) instead of only a Complex-object is this possible? many thanks in advance Oops

Read the article
Is Subversion's 'Lazy Copy' still lazy when overwriting a previously deleted file?

- by JW

Is Subversion's 'Lazy Copy' still lazy when overwriting a previously deleted file? I store my externals in a separate folder for each version: i.e say for dojo I'd have: webroot\ scripts\ dojo-v-1.0.0\ dojo-v-1.1.0\ etc. By doing this, for me at least, I feel it makes it easier to switch over to a new version. By only adding each new version i am not really giving svn the history it needs to do lazy copies. So one tactic I have used is to svn copy over the old version over to where the new one will be then svn delete that whole folder then unpack my newer version into that place then svn add them The idea is to avoid having a massive amount of duplicated data in my repo. I hope svn is looking at the new files and saying, "hey, i already had this once, copied, then deleted...so i am only going to be lazy store the changes". That was my theory - but does that happen in practice? p.s. Yes I know an alternative is to set the 'externals properties on the folder' - but that's another question.

Read the article
escape exactly what in javascript

- by Emin

Hi all, Being a newbie in javascript I came to a situation where I need more information on escaping characters in a string. Basically I know that in order to escape " I need to replace it with \" but what I don't know is for which characters I need to escape a particular string for. Is there a list of these "characters to escape"? or is it any character that is not a-zA-Z0-9 ? In my situation, I don't have control over the content that is being displayed on my page. Users enter some text and save it. I then use a webservice to extract them from the database, build a json array of objects, then iterate the array when I need to display them. In this case, I have - naturally - no idea of what the text the user has entered and therefore for what characters I need to escape. I also use jQuery for this specific project (just in case it has a function I am not aware of, to do what I need) Providing examples would be appreciated but I also want to learn the theory and logic behind it. Hope someone can be of any help.

Read the article
Android 2.1 gallery not backward compatible with Cupcake version, now what?

- by Schermvlieger

I don't know why, but in Eclair, the default (non-fancy) gallery app changed its begaviour from the Cupcake version, and it broke one of my commercial applications :-( Firstly, when long-pressing a gallery and choosing "Diashow", it does not publish an Intent to be picked up by any application that implements the Intent filter anymore. Instead, it will directly call "com.android.gallery/com.android.camera.ViewImage" with extras. Question: is it still possible to intercept this intent and allow the user to choose my application to do the Diashow? Secondly, the intent extras for the VIEW intent are messed up (in my build of 2.1 anyway): Instead of providing the BucketId of the picture in the Intent's queryparameter. But in 2.1, the BucketId is moved to the Intent's extras. Except; it is not passing the BUCKET_ID, but the unlocalized BUCKET_DISPLAY_NAME instead :-/ Question: how can I still get the unique BUCKET_ID from the intent, so that I do not have to work with a potentially non-unique BUCKET_DISPLAY_NAME? Is there anybody out there who has come up with a working solution for these problems? I thought the whole idea of Android Intents was to be able to integrate your applications with the base Android environment, but my build of 2.1 proves that this idea still lives in the land of Theory :-(

Read the article
Iterate Through JSON Data for Specific Element - Similar to XPath

- by Highroller

I am working on an embedded system and my theory of the overall process follows this methodology: 1. Send a command to the back end asking for account information in JSON format. 2. The back end writes a JSON file with all accounts and associated information (there could be 0 to 16 accounts). 3. Here's where I get stuck - use JavaScript (we are using the JQuery library) to iterate through the returned information for a specific element (similar to XPath) and build an array based on the number of elements found to populate a drop-down box to select the account you want to view and then do stuff with the account info. So my code looks like this: loadAccounts = function() { $.getJSON('/info?q=voip.accounts[]', function(result) { var sipAcnts = $("#sipacnts"); $(sipAcnts).empty(); // empty the dropdown (if necessarry) // Get the 'label' element and stick it in an array // build the array and append it to the sipAcnts dropdown // use array index to ref the accounts info and do stuff with it } So what I need is the JSON version of XPath to build the array of voip.accounts.label. The first account info looks something like this: { "result_set": { "voip.accounts[0]": { "label": "Dispatch1", "enabled": true, "user": "1234", "name": "Jane Doe", "type": "sip", "sip": { "lots and lots of stuff": }, } } } Am I over complicating the issue? Any wisdom anyone could thrown down would be greatly appreciated.

Read the article
How do I gather data from the same collumn in multiple worksheets in a single workbook?

- by infiniteloop91

Okay so here is what I want to accomplish. For this example I have a single workbook composed of 4 data sheets plus a totals sheet. Each of the 4 data sheets has a similar name following the same pattern where the only difference is the date. (Ex. 9854978_1009_US.txt, where 1009 is the date that changes while the rest of the file name is the same). In each of those documents column F contains a series of number that I would like to find the sum of but I will have no idea how many cells in F actually contain numbers. (However there will never be additional information below it the numbers so I could in theory just add the entire F column together). I will also add new files to the workbook over time and do not want to have to rewrite the code of which I gather my data from column F. Essentially what I would like to accomplish is for the 'totals' document to take every column F from documents in the workbook with the name of '9854978_????_US.txt', where the question marks change based on the file name. How would I go about doing this in pure Excel code?

Read the article
How to identify multiple identical pairs in two vectors

- by Sacha Epskamp

In my graph-package (as in graph theory, nodes connected by edges) I have a vector indicating for each edge the node of origin from, a vector indicating for each edge the node of destination to and a vector indicating the curve of each edge curve. By default I want edges to have a curve of 0 if there is only one edge between two nodes and curve of 0.2 if there are two edges between two nodes. The code that I use now is a for-loop, and it is kinda slow: curve <- rep(0,5) from<-c(1,2,3,3,2) to<-c(2,3,4,2,1) for (i in 1:length(from)) { if (any(from==to[i] & to==from[i])) { curve[i]=0.2 } } So basically I look for each edge (one index in from and one in to) if there is any other pair in from and to that use the same nodes (numbers). What I am looking for are two things: A way to identify if there is any pair of nodes that have two edges between them (so I can omit the loop if not) A way to speed up this loop # EDIT: To make this abit clearer, another example: from <- c(4L, 6L, 7L, 8L, 1L, 9L, 5L, 1L, 2L, 1L, 10L, 2L, 6L, 7L, 10L, 4L, 9L) to <- c(1L, 1L, 1L, 2L, 3L, 3L, 4L, 5L, 6L, 7L, 7L, 8L, 8L, 8L, 8L, 10L, 10L) cbind(from,to) from to [1,] 4 1 [2,] 6 1 [3,] 7 1 [4,] 8 2 [5,] 1 3 [6,] 9 3 [7,] 5 4 [8,] 1 5 [9,] 2 6 [10,] 1 7 [11,] 10 7 [12,] 2 8 [13,] 6 8 [14,] 7 8 [15,] 10 8 [16,] 4 10 [17,] 9 10 In these two vectors, pair 3 is identical to pair 10 (both 1 and 7 in different orders) and pairs 4 and 12 are identical (both 2 and 8). So I would want curve to become: [1,] 0.0 [2,] 0.0 [3,] 0.2 [4,] 0.2 [5,] 0.0 [6,] 0.0 [7,] 0.0 [8,] 0.0 [9,] 0.0 [10,] 0.2 [11,] 0.0 [12,] 0.2 [13,] 0.0 [14,] 0.0 [15,] 0.0 [16,] 0.0 [17,] 0.0 (as I vector, I transposed twice to get row numbers).

Read the article
Pointer and malloc issue

- by Andy

I am fairly new to C and am getting stuck with arrays and pointers when they refer to strings. I can ask for input of 2 numbers (ints) and then return the one I want (first number or second number) without any issues. But when I request names and try to return them, the program crashes after I enter the first name and not sure why. In theory I am looking to reserve memory for the first name, and then expand it to include a second name. Can anyone explain why this breaks? Thanks! #include <stdio.h> #include <stdlib.h> void main () { int NumItems = 0; NumItems += 1; char* NameList = malloc(sizeof(char[10])*NumItems); printf("Please enter name #1: \n"); scanf("%9s", NameList[0]); fpurge(stdin); NumItems += 1; NameList = realloc(NameList,sizeof(char[10])*NumItems); printf("Please enter name #2: \n"); scanf("%9s", NameList[1]); fpurge(stdin); printf("The first name is: %s",NameList[0]); printf("The second name is: %s",NameList[1]); return 0; }

Read the article
How do I do a table join on two fields in my second table?

- by Cannonade

I have two tables: Messages - Amongst other things, has a to_id and a from_id field. People - Has a corresponding person_id I am trying to figure out how to do the following in a single linq query: Give me all messages that have been sent to and from person x (idself). I had a couple of cracks at this. Not quite right MsgPeople = (from p in db.people join m in db.messages on p.person_id equals m.from_id where (m.from_id == idself || m.to_id == idself) orderby p.name descending select p).Distinct(); This almost works, except I think it misses one case: "people who have never received a message, just sent one to me" How this works in my head So what I really need is something like: join m in db.messages on (p.people_id equals m.from_id or p.people_id equals m.to_id) Gets me a subset of the people I am after It seems you can't do that. I have tried a few other options, like doing two joins: MsgPeople = (from p in db.people join m in AllMessages on p.person_id equals m.from_id join m2 in AllMessages on p.person_id equals m2.to_id where (m2.from_id == idself || m.to_id == idself) orderby p.name descending select p).Distinct(); but this gives me a subset of the results I need, I guess something to do with the order the joins are resolved. My understanding of LINQ (and perhaps even database theory) is embarrassingly superficial and I look forward to having some light shed on my problem.

Read the article
How can I get a list of modified records from a SQL Server database?

- by Pixelfish

I am currently in the process of revamping my company's management system to run a little more lean in terms of network traffic. Right now I'm trying to figure out an effective way to query only the records that have been modified (by any user) since the last time I asked. When the application starts it loads the job information and caches it locally like the following: SELECT * FROM jobs. I am writing out the date/time a record was modified ala UPDATE jobs SET Widgets=@Widgets, LastModified=GetDate() WHERE JobID=@JobID. When any user requests the list of jobs I query all records that have been modified since the last time I requested the list like the following: SELECT * FROM jobs WHERE LastModified>=@LastRequested and store the date/time of the request to pass in as @LastRequest when the user asks again. In theory this will return only the records that have been modified since the last request. The issue I'm running into is when the user's date/time is not quite in sync with the server's date/time and also of server load when querying an un-indexed date/time column. Is there a more effective system then querying date/time information?

Read the article
How to speed up a slow UPDATE query

- by Mike Christensen

I have the following UPDATE query: UPDATE Indexer.Pages SET LastError=NULL where LastError is not null; Right now, this query takes about 93 minutes to complete. I'd like to find ways to make this a bit faster. The Indexer.Pages table has around 506,000 rows, and about 490,000 of them contain a value for LastError, so I doubt I can take advantage of any indexes here. The table (when uncompressed) has about 46 gigs of data in it, however the majority of that data is in a text field called html. I believe simply loading and unloading that many pages is causing the slowdown. One idea would be to make a new table with just the Id and the html field, and keep Indexer.Pages as small as possible. However, testing this theory would be a decent amount of work since I actually don't have the hard disk space to create a copy of the table. I'd have to copy it over to another machine, drop the table, then copy the data back which would probably take all evening. Ideas? I'm using Postgres 9.0.0. UPDATE: Here's the schema: CREATE TABLE indexer.pages ( id uuid NOT NULL, url character varying(1024) NOT NULL, firstcrawled timestamp with time zone NOT NULL, lastcrawled timestamp with time zone NOT NULL, recipeid uuid, html text NOT NULL, lasterror character varying(1024), missingings smallint, CONSTRAINT pages_pkey PRIMARY KEY (id ), CONSTRAINT indexer_pages_uniqueurl UNIQUE (url ) ); I also have two indexes: CREATE INDEX idx_indexer_pages_missingings ON indexer.pages USING btree (missingings ) WHERE missingings > 0; and CREATE INDEX idx_indexer_pages_null ON indexer.pages USING btree (recipeid ) WHERE NULL::boolean; There are no triggers on this table, and there is one other table that has a FK constraint on Pages.PageId.

Read the article
Collision Detection (Ground & Slopes) in 2D Platform Game using Pygame Rects

- by RedCap

Hi, First off, I am not after any instructions on logic for collision detection; I get it. What I am trying to work out is the least complicated way to do this with Pygame using Sprites & Rects. I want to be able to check collisions for the Player against ground, walls & slopes. In theory it is quite straight forward, but I'm having difficulty because it seems like you cannot do this with one Rect. One Rect is simple enough to get you collisions in the X plane against walls. The same Rect could be used also be used in the Y plane against solids, but not with slopes - since with the collision routines in Pygame it checks the whole Rect (or mask), rather than perhaps just the bottom middle of the Rect. It seems in addition you need to have a number of "sprites" to check collisions with, that are 1x1 pixel in various places around the Player. What's the easiest way to do this, without having a bunch of 3, 4, or more separate "collision pixels" to check against slopes? Geoff

Read the article
"Find all tiles connected to this one" project

- by Omega

Remember MS Paint? The bucket tool? If you used it and clicked on a pixel, all pixels connected to this pixel that are the same are affected. The theory is, I suppose, to check if there is any pixel adjacent to the selected one. If such pixel is the same type as the selected one, check for more adjacent pixels in this one, and so on. I want to implement something similar in VB.NET. Basically I have a 2D array map which represents the map. Let's assume there are only two types of tile: 0 and 1. Now, I got pretty much everything ready: I got my 2d map and I can tell which tile is clicked and tell what array indexes are the ones that represent such tile. Now for the "painting" process. Whenever I think about it, I can't figure a convenient way to execute such iteration. Can someone help me choosing a correct design/way/tip to achieve this?

Read the article
Can Visual Studio (should it be able to) compute a diff between any two changesets associated with a

- by Hamish Grubijan

Here is my use case: I start on a project XYZ, for which I create a work item, and I make frequent check-ins, easily 10-20 in total. ALL of the code changes will be code-read and code-reviewed. The change sets are not consecutive - other people check-in in-between my changes, although they are very unlikely to touch the exact same files. So ... at the en of the project I am interested in a "total diff" - as if there was a single check-in by me to complete the entire project. In theory this is computable. From the list of change sets associated with the work item, you get the list of all files that were affected. Then, the algorithm can aggregate individual diffs over each file and combine them into one. It is possible that a pure total diff is uncomputable due to the fact that someone else renamed files, or changed stuff around very closely, or in the same functions as me. I that case ... I suppose a total diff can include those changes by non-me as well, and warn me about the fact. I would find this very useful, but I do not know how to do t in practice. Can Visual Studio 2008/2010 (and/or TFS server) do it? Are there other source control systems capable of doing this? Thanks.

Read the article
Is `super` local variable?

- by Michael

// A : Parent @implementation A -(id) init { // change self here then return it } @end A A *a = [[A alloc] init]; a. Just wondering, if self is a local variable or global? If it's local then what is the point of self = [super init] in init? I can successfully define some local variable and use like this, why would I need to assign it to self. -(id) init { id tmp = [super init]; if(tmp != nil) { //do stuff } return tmp; } b. If [super init] returns some other object instance and I have to overwrite self then I will not be able to access A's methods any more, since it will be completely new object? Am I right? c. super and self pointing to the same memory and the major difference between them is method lookup order. Am I right? sorry, don't have Mac to try, learning theory as for now...

Read the article
Rails: Duplicate functionality across controllers? A humble plea.

- by Alex

So I'm working with authlogic, and I'm trying to duplicate the login functionality to the welcome page, so that you can log in by restful url or by just going to the main page. No, I don't know if we'll keep that feature, but I want to test it out anyway. Here's the error message: RuntimeError in Welcome#index Called id for nil, which would mistakenly be 4 -- if you really wanted the id of nil, use object_id The code is below. Basically, what's happening is the index view (the first code snippet) is sending the information from the form to the create method of user_sessions controller. At this point, in theory, it create should just pick up, but it doesn't. PLEASE help. Please. I've been doing this for about 8 hours. I checked Google. I checked IRC. I checked every book I could find. You don't even have to answer, I can to the grunt work if you just point me in the right direction. <% form_for @user_session, :url => user_sessions_path do |f| %> <%= f.text_field :email %><br /> <%= f.password_field :password %> <%= submit_tag 'Login' %> <% end %> class ApplicationController < ActionController::Base helper :all # include all helpers, all the time protect_from_forgery # See ActionController::RequestForgeryProtection for details # Scrub sensitive parameters from your log # filter_parameter_logging :password helper_method :current_user_session, :current_user before_filter :new_session_object protected def new_session_object unless current_user @user_session = UserSession.new(params[:user_session]) end end private def current_user_session return @current_user_session if defined?(@current_user_session) @current_user_session = UserSession.find end def current_user return @current_user if defined?(@current_user) @current_user = current_user_session && current_user_session.record end end

Read the article
Quickly generate junk data of certain size in Javascript

- by user1357607

I am writing an upload speed test in Javascript. I am using Jquery (and Ajax) to send chunks of data to a server in order to time how long it takes to get a response. This should, in theory give an estimation, of the upload speed. Of course to cater for different bandwidths of the user I sequentially upload larger and larger amounts of junk data until a threshold duration is reached. Currently I generate the junk data using the following function, however, it is very slow when generation megabytes of data. function generate_random_data(size){ var chars = "abcdefghijklmnopqrstuvwxyz"; var random_data = ""; for (var i = 0; i < size; i++){ var random_num = Math.floor(Math.random() * char.length); random_data = random_data + chars.substring(random_num,random_num+1); } return random_data; Really all I am doing is generating a chunk of bytes to send to the server, however, this is the only way I could find out how in Javascript. Any help would be appreciated.

Read the article
Need help making a div appear on the bottom of the screen while the rest of the divs scroll

- by user1896600

It's hard to describe my specific problem without just showing you the HTML code. The HTML source can be seen easily from clicking "View Source" while seeing the page http://techdot.us/projects/jeopardy/view.php. The CSS is located here: http://techdot.us/projects/jeopardy/style/gameStyle.css. My main goal is to have the main content table rows/columns appear on the majority of the screen (everything except 69px, to be exact). So, the bottom 69px contains an informational panel that is supposed to stay on the bottom of the screen, even when the user scrolls down the page. Scrolling is supposed to, in theory, trigger the majority of the content to go down the page normally, except the bottom bar which stays static. I have achieved this effect on the website. However, there is a big problem. On smaller screens (as you can simulate by resizing the window), some of the main table gets cut off. It seems that my CSS solution is a botch, and, in effect, does not accomplish my main goal. The bottom bar should not cut off part of the table from the main content div (gameTable), but the main content div should display all of its content in a scrollable fashion. My CSS at the moment works as long as the viewer's screen is a certain pixel height. This is definitely not permanent. Thank you SO much for the help. I really appreciate it and totally understand that I'm being a total pain by just throwing down tons of CSS and HTML code to edit.

Read the article
Database PK-FK design for future-effective-date entries?

- by Scott Balmos

Ultimately I'm going to convert this into a Hibernate/JPA design. But I wanted to start out from purely a database perspective. We have various tables containing data that is future-effective-dated. Take an employee table with the following pseudo-definition: employee id INT AUTO_INCREMENT ... data fields ... effectiveFrom DATE effectiveTo DATE employee_reviews id INT AUTO_INCREMENT employee_id INT FK employee.id Very simplistic. But let's say Employee A has id = 1, effectiveFrom = 1/1/2011, effectiveTo = 1/1/2099. That employee is going to be changing jobs in the future, which would in theory create a new row, id = 2 with effectiveFrom = 7/1/2011, effectiveTo = 1/1/2099, and id = 1's effectiveTo updated to 6/30/2011. But now, my program would have to go through any table that has a FK relationship to employee every night, and update those FK to reference the newly-effective employee entry. I have seen various postings in both pure SQL and Hibernate forums that I should have a separate employee_versions table, which is where I would have all effective-dated data stored, resulting in the updated pseudo-definition below: employee id INT AUTO_INCREMENT employee_versions id INT AUTO_INCREMENT employee_id INT FK employee.id ... data fields ... effectiveFrom DATE effectiveTo DATE employee_reviews id INT AUTO_INCREMENT employee_id INT FK employee.id Then to get any actual data, one would have to actually select from employee_versions with the proper employee_id and date range. This feels rather unnatural to have this secondary "versions" table for each versioned entity. Anyone have any opinions, suggestions from your own prior work, etc? Like I said, I'm taking this purely from a general SQL design standpoint first before layering in Hibernate on top. Thanks!

Read the article
Migrate from Oracle to MySQL

- by Cassy

Hi together. We ran into serious performance problems with our Oracle database and we would like to try to migrate to a MySQL-based database (either MySQL directly or, more preferrable, Infobright). The thing is, we need to let the old and the new system overlap for at least some weeks if not months, before we actually know, if all features of the new database match our needs. So, here is our situation: The Oracle database consists of multiple tables with each millions of rows. During the day, there are literally thousands of statements, which we cannot stop for migration. Every morning, new data is imported into the Oracle database, replacing some thousands of rows. Copying this process is not a problem, so we could, in theory, import in both databases in parallel. But, and here lies the challenge, for this to work, we need to have an export from the Oracle database with a consistent state from one day. (We cannot export some tables on Monday and some others on Tuesday, etc.) This means, that at least the export should be finished in less than one day. Our first thought was to dump the schema, but I wasn't able to find a tool to import an Oracle dump file into mysql. Exporting tables in CSV files might work, but I'm afraid it could take too long. So my question now is: What should I do? Is there any tool to import Oracle dump files into MySQL? Does anybody have any experience with such a large-scale migration? Thanks in advance, Cassy PS: Please, don't suggest performance optimization techniques for Oracle, we already tried a lot :-)

Read the article
Setting up a transparent SSL proxy

- by badunk

I've got a linux box set up with 2 network cards to inspect traffic going through port 80. One card is used to go out to the internet, the other one is hooked up to a networking switch. The point is to be able to inspect all HTTP and HTTPS traffic on devices hooked up to that switch for debugging purposes. I've written the following rules for iptables: nat -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.2.1:1337 -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 1337 -A POSTROUTING -s 192.168.2.0/24 -o eth0 -j MASQUERADE On 192.168.2.1:1337, I've got a transparent http proxy using Charles (http://www.charlesproxy.com/) for recording. Everything's fine for port 80, but when I add similar rules for port 443 (SSL) pointing to port 1337, I get an error about invalid message through Charles. I've used SSL proxying on the same computer before with Charles (http://www.charlesproxy.com/documentation/proxying/ssl-proxying/), but have been unsuccessful with doing it transparently for some reason. Some resources I've googled say its not possible - I'm willing to accept that as an answer if someone can explain why. As a note, I have full access to the described set up including all the clients hooked up to the subnet - so I can accept self-signed certs by Charles. The solution doesn't have to be Charles-specific since in theory, any transparent proxy will do. Thanks! Edit: After playing with it a little, I was able to get it working for a specific host. When I modify my iptables to the following (and open 1338 in charles for reverse proxy): nat -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j DNAT --to-destination 192.168.2.1:1337 -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 1337 -A PREROUTING -i eth1 -p tcp -m tcp --dport 443 -j DNAT --to-destination 192.168.2.1:1338 -A PREROUTING -i eth1 -p tcp -m tcp --dport 443 -j REDIRECT --to-ports 1338 -A POSTROUTING -s 192.168.2.0/24 -o eth0 -j MASQUERADE I am able to get a response, but with no destination host. In the reverse proxy, if I just specify that everything from 1338 goes to a specific host that I wanted to hit, it performs the hand shake properly and I can turn on SSL proxying to inspect the communication. The setup is less than ideal because I don't want to assume everything from 1338 goes to that host - any idea why the destination host is being stripped? Thanks again

Read the article
OS choice between: Debian, gNewSense, and OpenSolaris

- by penyuan

I am planning to migrate from Mac OS X and Windows to either a Unix or Linux distribution, i.e. I am a Linux/Unix beginner. Right now the following caught my interest: Debian: Well established with huge repository of 20000+ apps. gNewSence: "Totally free" version of Ubuntu, so it should be more beginner friendly? OpenSolaris: Also open-source, and built on "strong" Unix base. I do mainly basic tasks such as web browsing, office work, maintaining big photo collection, and a little bit of programming. Questions: How "free" are each of these distributions compared to each other, is this whole freedom thing a big deal? Will a binary labeled as for Ubuntu work on gNewSense? What are simple IDEs for Debian and gNewSense?

Read the article

< Previous Page | 389 390 391 392 393 394 395 396 397 398 399 400 | Next Page >