Search Results

Search found 59975 results on 2399 pages for 'data comparison'.

Page 435/2399 | < Previous Page | 431 432 433 434 435 436 437 438 439 440 441 442  | Next Page >

  • What is the best algorithm for this array-comparison problem?

    - by mark
    What is the most efficient for speed algorithm to solve the following problem? Given 6 arrays, D1,D2,D3,D4,D5 and D6 each containing 6 numbers like: D1[0] = number D2[0] = number ...... D6[0] = number D1[1] = another number D2[1] = another number .... ..... .... ...... .... D1[5] = yet another number .... ...... .... Given a second array ST1, containing 1 number: ST1[0] = 6 Given a third array ans, containing 6 numbers: ans[0] = 3, ans[1] = 4, ans[2] = 5, ......ans[5] = 8 Using as index for the arrays D1,D2,D3,D4,D5 and D6, the number that goes from 0, to the number stored in ST1[0] minus one, in this example 6, so from 0 to 6-1, compare each res array against each D array My algorithm so far is: I tried to keep everything unlooped as much as possible. EML := ST1[0] //number contained in ST1[0] EML1 := 0 //start index for the arrays D While EML1 < EML if D1[ELM1] = ans[0] goto two if D2[ELM1] = ans[0] goto two if D3[ELM1] = ans[0] goto two if D4[ELM1] = ans[0] goto two if D5[ELM1] = ans[0] goto two if D6[ELM1] = ans[0] goto two ELM1 = ELM1 + 1 return 0 //If the ans[0] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers two: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[1] goto three if D2[ELM1] = ans[1] goto three if D3[ELM1] = ans[1] goto three if D4[ELM1] = ans[1] goto three if D5[ELM1] = ans[1] goto three if D6[ELM1] = ans[1] goto three ELM1 = ELM1 + 1 return 0 //If the ans[1] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers three: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[2] goto four if D2[ELM1] = ans[2] goto four if D3[ELM1] = ans[2] goto four if D4[ELM1] = ans[2] goto four if D5[ELM1] = ans[2] goto four if D6[ELM1] = ans[2] goto four ELM1 = ELM1 + 1 return 0 //If the ans[2] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers four: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[3] goto five if D2[ELM1] = ans[3] goto five if D3[ELM1] = ans[3] goto five if D4[ELM1] = ans[3] goto five if D5[ELM1] = ans[3] goto five if D6[ELM1] = ans[3] goto five ELM1 = ELM1 + 1 return 0 //If the ans[3] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers five: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[4] goto six if D2[ELM1] = ans[4] goto six if D3[ELM1] = ans[4] goto six if D4[ELM1] = ans[4] goto six if D5[ELM1] = ans[4] goto six if D6[ELM1] = ans[4] goto six ELM1 = ELM1 + 1 return 0 //If the ans[4] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers six: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[5] return 1 ////If the ans[1] number is not found in either D1[0-6]..... if D2[ELM1] = ans[5] return 1 which will then include ans[0-6] numbers return 1 if D3[ELM1] = ans[5] return 1 if D4[ELM1] = ans[5] return 1 if D5[ELM1] = ans[5] return 1 if D6[ELM1] = ans[5] return 1 ELM1 = ELM1 + 1 return 0 As language of choice, it would be pure c

    Read the article

  • How to troubleshoot a Highcharts script that's not rendering data when date is added and hanging the JS engine with large datasets?

    - by ylluminate
    I have a Highchart JS graph that I'm building in Rails (although I don't think Ruby has real bearing on this problem unless it's the Date output format) to which I'm adding the timestamp of each datapoint. Presently the array of floats is rendering fine without timestamps, however when I add the timestamp to the series it fails to rend. What's worse is that when the series has hundreds of entries all sorts of problems arise, not the least of which is the browser entirely hanging and requiring a force quit / kill. I'm using the following to build the array of arrays data series: series1 = readings.map{|row| [(row.date.to_i * 1000), (row.data1.to_f if BigDecimal(row.data1) != BigDecimal("-1000.0"))] } This yields a result like this: series: [{"name":"Data 1","data":[[1326262980000,1.79e-09],[1326262920000,1.29e-09],[1326262860000,1.22e-09],[1326262800000,1.42e-09],[1326262740000,1.29e-09],[1326262680000,1.34e-09],[1326262620000,1.31e-09],[1326262560000,1.51e-09],[1326262500000,1.24e-09],[1326262440000,1.7e-09],[1326262380000,1.24e-09],[1326262320000,1.29e-09],[1326262260000,1.53e-09],[1326262200000,1.23e-09],[1326262140000,1.21e-09]],"color":"blue"}] Yet nothing appears on the graph as noted. Notwithstanding, when I compare the data series in one of their very similar examples here: http://www.highcharts.com/demo/spline-irregular-time It appears that really the data series are formatted identically (except in mine I use the timestamp vs date method). This leads me to think I've got a problem with the timestamp output, but I'm just not able to figure out where / how as I'm converting the date output to an integer multipled by 1000 to convert it to milliseconds as per explained in a similar Railscasts tutorial. I would very much appreciate it if someone could point me in the right direction here as to what I may be doing wrong. What could cause no data to appear on the graph in smaller sized sets (<100 points) and when into the hundreds causes an apparent hang in the javascript engine in this case? Perhaps ultimately the key lies here as this is the entire js that's being generated and not rendering: jQuery(function() { // 1. Define JSON options var options = { chart: {"defaultSeriesType":"spline","renderTo":"chart_name"}, title: {"text":"Title"}, legend: {"layout":"vertical","style":{}}, xAxis: {"title":{"text":"UTC Time"},"type":"datetime"}, yAxis: [{"title":{"text":"Left Title","margin":10}},{"title":{"text":"Right Groups Title"},"opposite":true}], tooltip: {"enabled":true}, credits: {"enabled":false}, plotOptions: {"areaspline":{}}, series: [{"name":"Data 1","data":[[1326262980000,1.79e-08],[1326262920000,1.69e-08],[1326262860000,1.62e-08],[1326262800000,1.42e-08],[1326262740000,1.29e-08],[1326262680000,1.34e-08],[1326262620000,1.31e-08],[1326262560000,1.51e-08],[1326262500000,1.64e-08],[1326262440000,1.7e-08],[1326262380000,1.64e-08],[1326262320000,1.69e-08],[1326262260000,1.53e-08],[1326262200000,1.23e-08],[1326262140000,1.21e-08]],"color":"blue"},{"name":"Data 2","data":[[1326262980000,9.79e-09],[1326262920000,9.78e-09],[1326262860000,9.8e-09],[1326262800000,9.82e-09],[1326262740000,9.88e-09],[1326262680000,9.89e-09],[1326262620000,1.3e-06],[1326262560000,1.32e-06],[1326262500000,1.33e-06],[1326262440000,1.33e-06],[1326262380000,1.34e-06],[1326262320000,1.33e-06],[1326262260000,1.32e-06],[1326262200000,1.32e-06],[1326262140000,1.32e-06]],"color":"red"}], subtitle: {} }; // 2. Add callbacks (non-JSON compliant) // 3. Build the chart var chart = new Highcharts.StockChart(options); });

    Read the article

  • Handles Comparison: empty classes vs. undefined classes vs. void*

    - by Nawaz
    Microsoft's GDI+ defines many empty classes to be treated as handles internally. For example, (source GdiPlusGpStubs.h) //Approach 1 class GpGraphics {}; class GpBrush {}; class GpTexture : public GpBrush {}; class GpSolidFill : public GpBrush {}; class GpLineGradient : public GpBrush {}; class GpPathGradient : public GpBrush {}; class GpHatch : public GpBrush {}; class GpPen {}; class GpCustomLineCap {}; There are other two ways to define handles. They're, //Approach 2 class BOOK; //no need to define it! typedef BOOK *PBOOK; typedef PBOOK HBOOK; //handle to be used internally //Approach 3 typedef void* PVOID; typedef PVOID HBOOK; //handle to be used internally I just want to know the advantages and disadvantages of each of these approaches. One advantage with Microsoft's approach is that, they can define type-safe hierarchy of handles using empty classes, which (I think) is not possible with the other two approaches. What else? EDIT: One advantage with the second approach (i.e using incomplete classes) is that we can prevent clients from dereferencing the handles (that means, this approach appears to support encapsulation strongly, I suppose). The code would not even compile if one attempts to dereference handles. What else?

    Read the article

  • Why date comparison in sql is not working. Please help

    - by Shantanu Gupta
    I am trying to fetch some records from table but when i use OR instead of AND it returns me few records but not in other case. dates given exactly are present in table. What mistake i am doing ? select newsid,title,detail,hotnews from view_newsmaster where datefrom>=CONVERT(datetime, '4-22-2010',111) AND dateto<=CONVERT(datetime, '4-22-2010',111)

    Read the article

  • Which term to use when referring to functional data structures: persistent or immutable?

    - by Bob
    In the context of functional programming which is the correct term to use: persistent or immutable? When I Google "immutable data structures" I get a Wikipedia link to an article on "Persistent data structure" which even goes on to say: such data structures are effectively immutable Which further confuses things for me. Do functional programs rely on persistent data structures or immutable data structures? Or are they always the same thing?

    Read the article

  • Problem with receiving data form serial port in c#?

    - by moon
    hello i have problem with receiving data from serial port in c# in am inserting a new line operator at the end of data buffer. then i send this data buffer on serial port, after this my c# GUI receiver will take this data via Readline() function but it always give me raw data not the actual one how to resolve this problem.

    Read the article

  • Speed comparison - Template specialization vs. Virtual Function vs. If-Statement

    - by Person
    Just to get it out of the way... Premature optimization is the root of all evil Make use of OOP etc. I understand. Just looking for some advice regarding the speed of certain operations that I can store in my grey matter for future reference. Say you have an Animation class. An animation can be looped (plays over and over) or not looped (plays once), it may have unique frame times or not, etc. Let's say there are 3 of these "either or" attributes. Note that any method of the Animation class will at most check for one of these (i.e. this isn't a case of a giant branch of if-elseif). Here are some options. 1) Give it boolean members for the attributes given above, and use an if statement to check against them when playing the animation to perform the appropriate action. Problem: Conditional checked every single time the animation is played. 2) Make a base animation class, and derive other animations classes such as LoopedAnimation and AnimationUniqueFrames, etc. Problem: Vtable check upon every call to play the animation given that you have something like a vector<Animation>. Also, making a separate class for all of the possible combinations seems code bloaty. 3) Use template specialization, and specialize those functions that depend on those attributes. Like template<bool looped, bool uniqueFrameTimes> class Animation. Problem: The problem with this is that you couldn't just have a vector<Animation> for something's animations. Could also be bloaty. I'm wondering what kind of speed each of these options offer? I'm particularly interested in the 1st and 2nd option because the 3rd doesn't allow one to iterate through a general container of Animations. In short, what is faster - a vtable fetch or a conditional?

    Read the article

  • Could this C cast to avoid a signed/unsigned comparison make any sense?

    - by sharptooth
    I'm reviewing a C++ project and see effectively the following: std::vector<SomeType> objects; //then later int size = (int)objects.size(); for( int i = 0; i < size; ++i ) { process( objects[i] ); } Here's what I see. std::vector::size() returns size_t that can be of some size not related to the size of int. Even if sizeof(int) == sizeof(size_t) int is signed and can't hold all possible values of size_t. So the code above could only process the lower part of a very long vector and contains a bug. That said I'm curious of why the author might have written this? My only guess is that first he omitted the (int) cast and the compiler emitted something like Visual C++ C4018 warning: warning C4018: '<' : signed/unsigned mismatch so the author though that the best way to avoid the compiler warning would be to simply cast the size_t to int thus making the compiler shut up. Is there any other possible sane reason for that C cast?

    Read the article

  • Indexing with pointer C/C++

    - by Leavenotrace
    Hey I'm trying to write a program to carry out newtons method and find the roots of the equation exp(-x)-(x^2)+3. It works in so far as finding the root, but I also want it to print out the root after each iteration but I can't get it to work, Could anyone point out my mistake I think its something to do with my indexing? Thanks a million :) #include <stdio.h> #include <math.h> #include <malloc.h> //Define Functions: double evalf(double x) { double answer=exp(-x)-(x*x)+3; return(answer); } double evalfprime(double x) { double answer=-exp(-x)-2*x; return(answer); } double *newton(double initialrt,double accuracy,double *data) { double root[102]; data=root; int maxit = 0; root[0] = initialrt; for (int i=1;i<102;i++) { *(data+i)=*(data+i-1)-evalf(*(data+i-1))/evalfprime(*(data+i-1)); if(fabs(*(data+i)-*(data+i-1))<accuracy) { maxit=i; break; } maxit=i; } if((maxit+1==102)&&(fabs(*(data+maxit)-*(data+maxit-1))>accuracy)) { printf("\nMax iteration reached, method terminated"); } else { printf("\nMethod successful"); printf("\nNumber of iterations: %d\nRoot Estimate: %lf\n",maxit+1,*(data+maxit)); } return(data); } int main() { double root,accuracy; double *data=(double*)malloc(sizeof(double)*102); printf("NEWTONS METHOD PROGRAMME:\nEquation: f(x)=exp(-x)-x^2+3=0\nMax No iterations=100\n\nEnter initial root estimate\n>> "); scanf("%lf",&root); _flushall(); printf("\nEnter accuracy required:\n>>"); scanf("%lf",&accuracy); *data= *newton(root,accuracy,data); printf("Iteration Root Error\n "); printf("%d %lf \n", 0,*(data)); for(int i=1;i<102;i++) { printf("%d %5.5lf %5.5lf\n", i,*(data+i),*(data+i)-*(data+i-1)); if(*(data+i*sizeof(double))-*(data+i*sizeof(double)-1)==0) { break; } } getchar(); getchar(); free(data); return(0); }

    Read the article

  • How do I use a file grep comparison inside a bash if/else statement?

    - by openid_kenja
    When our server comes up we need to check a file to see how the server is configured. We want to search for the following string inside our /etc/aws/hosts.conf file: MYSQL_ROLE=master Then, we want to test whether that string exists and use an if/else statement to run one of two options depending on whether the string exists or not. What is the BASH syntax for the if statement? if [ ????? ]; then #do one thing else #do another thing fi

    Read the article

  • C++: How to make comparison function for char arrays?

    - by Newbie
    Is this possible? i get weird error message when i put char as the type: inline bool operator==(const char *str1, const char *str2){ // ... } Error message: error C2803: 'operator ==' must have at least one formal parameter of class type ... which i dont understand at all. I was thinking if i could directly compare stuff like: const char *str1 = "something"; const char *str2 = "something else"; const char str3[] = "lol"; // not sure if this is same as above and then compare: if(str1 == str2){ // ... } etc. But i also want it to work with: char *str = new char[100]; and: char *str = (char *)malloc(100); I am assuming every char array i use this way would end in NULL character, so the checking should be possible, but i understand it can be unsafe etc. I just want to know if this is possible to do, and how.

    Read the article

  • How to use an excel data-set for a multi-line ggplot in R?

    - by user1299887
    I have a data set in excel that I am trying to create a multiple line plot with on R. The data set contains 7 food groups and the calories consumed daily associated to the groups. As well, there is that set of data over 38 years (from 1970-2008) and I am attempting to use this data set to create a multiple line plot on R. I have tried for hours on end but can not seem to get R to recognize the variables within the data set.

    Read the article

  • Configuring Multiple Instances of MySQL in Solaris 11

    - by rajeshr
    Recently someone asked me for steps to configure multiple instances of MySQL database in an Operating Platform. Coz of my familiarity with Solaris OE, I prepared some notes on configuring multiple instances of MySQL database on Solaris 11. Maybe it's useful for some: If you want to run Solaris Operating System (or any other OS of your choice) as a virtualized instance in desktop, consider using Virtual Box. To download Solaris Operating System, click here. Once you have your Solaris Operating System (Version 11) up and running and have Internet connectivity to gain access to the Image Packaging System (IPS), please follow the steps as mentioned below to install MySQL and configure multiple instances: 1. Install MySQL Database in Solaris 11 $ sudo pkg install mysql-51 2. Verify if the mysql is installed: $ svcs -a | grep mysql Note: Service FMRI will look similar to the one here: svc:/application/database/mysql:version_51 3. Prepare data file system for MySQL Instance 1 zfs create rpool/mysql zfs create rpool/mysql/data zfs set mountpoint=/mysql/data rpool/mysql/data 4. Prepare data file system for MySQL Instance 2 zfs create rpool/mysql/data2 zfs set mountpoint=/mysql/data rpool/mysql/data2 5. Change the mysql/datadir of the MySQL Service (SMF) to point to /mysql/data $ svcprop mysql:version_51 | grep mysql/data $ svccfg -s mysql:version_51 setprop mysql/data=/mysql/data 6. Create a new instance of MySQL 5.1 (a) Copy the manifest of the default instance to temporary directory: $ sudo cp /lib/svc/manifest/application/database/mysql_51.xml /var/tmp/mysql_51_2.xml (b) Make appropriate modifications on the XML file $ sudo vi /var/tmp/mysql_51_2.xml - Change the "instance name" section to a new value "version_51_2" - Change the value of property name "data" to point to the ZFS file system "/mysql/data2" 7. Import the manifest to the SMF repository: $ sudo svccfg import /var/tmp/mysql_51_2.xml 8. Before starting the service, copy the file /etc/mysql/my.cnf to the data directories /mysql/data & /mysql/data2. $ sudo cp /etc/mysql/my.cnf /mysql/data/ $ sudo cp /etc/mysql/my.cnf /mysql/data2/ 9. Make modifications to the my.cnf in each of the data directories as required: $ sudo vi /mysql/data/my.cnf Under the [client] section port=3306 socket=/tmp/mysql.sock ---- ---- Under the [mysqld] section port=3306 socket=/tmp/mysql.sock datadir=/mysql/data ----- ----- server-id=1 $ sudo vi /mysql/data2/my.cnf Under the [client] section port=3307 socket=/tmp/mysql2.sock ----- ----- Under the [mysqld] section port=3307 socket=/tmp/mysql2.sock datadir=/mysql/data2 ----- ----- server-id=2 10. Make appropriate modification to the startup script of MySQL (managed by SMF) to point to the appropriate my.cnf for each instance: $ sudo vi /lib/svc/method/mysql_51 Note: Search for all occurences of mysqld_safe command and modify it to include the --defaults-file option. An example entry would look as follows: ${MySQLBIN}/mysqld_safe --defaults-file=${MYSQLDATA}/my.cnf --user=mysql --datadir=${MYSQLDATA} --pid=file=${PIDFILE} 11. Start the service: $ sudo svcadm enable mysql:version_51_2 $ sudo svcadm enable mysql:version_51 12. Verify that the two services are running by using: $ svcs mysql 13. Verify the processes: $ ps -ef | grep mysqld 14. Connect to each mysqld instance and verify: $ mysql --defaults-file=/mysql/data/my.cnf -u root -p $ mysql --defaults-file=/mysql/data2/my.cnf -u root -p Some references for Solaris 11 newbies Taking your first steps with Solaris 11 Introducing the basics of Image Packaging System Service Management Facility How To Guide For a detailed list of official educational modules available on Solaris 11, please visit here For MySQL courses from Oracle University access this page.

    Read the article

  • How would you gather client's data on Google App Engine without using Datastore/Backend Instances too much?

    - by ruslan
    I'm relatively new to StackExchange and not sure if it's appropriate place to ask design question. Site gives me a hint "The question you're asking appears subjective and is likely to be closed". Please let me know. Anyway.. One of the projects I'm working on is online survey engine. It's my first big commercial project on Google App Engine. I need your advice on how to collect stats and efficiently record them in DataStore without bankrupting me. Initial requirements are: After user finishes survey client sends list of pairs [ID (int) + PercentHit (double)]. This list shows how close answers of this user match predefined answers of reference answerers (which identified by IDs). I call them "target IDs". Creator of the survey wants to see aggregated % for given IDs for last hour, particular timeframe or from the beginning of the survey. Some surveys may have thousands of target/reference answerers. So I created entity public class HitsStatsDO implements Serializable { @Id transient private Long id; transient private Long version = (long) 0; transient private Long startDate; @Parent transient private Key parent; // fake parent which contains target id @Transient int targetId; private double avgPercent; private long hitCount; } But writing HitsStatsDO for each target from each user would give a lot of data. For instance I had a survey with 3000 targets which was answered by ~4 million people within one week with 300K people taking survey in first day. Even if we assume they were answering it evenly for 24 hours it would give us ~1040 writes/second. Obviously it hits concurrent writes limit of Datastore. I decided I'll collect data for one hour and save that, that's why there are avgPercent and hitCount in HitsStatsDO. GAE instances are stateless so I had to use dynamic backend instance. There I have something like this: // Contains stats for one hour private class Shard { ReadWriteLock lock = new ReentrantReadWriteLock(); Map<Integer, HitsStatsDO> map = new HashMap<Integer, HitsStatsDO>(); // Key is target ID public void saveToDatastore(); public void updateStats(Long startDate, Map<Integer, Double> hits); } and map with shard for current hour and previous hour (which doesn't stay here for long) private HashMap<Long, Shard> shards = new HashMap<Long, Shard>(); // Key is HitsStatsDO.startDate So once per hour I dump Shard for previous hour to Datastore. Plus I have class LifetimeStats which keeps Map<Integer, HitsStatsDO> in memcached where map-key is target ID. Also in my backend shutdown hook method I dump stats for unfinished hour to Datastore. There is only one major issue here - I have only ONE backend instance :) It raises following questions on which I'd like to hear your opinion: Can I do this without using backend instance ? What if one instance is not enough ? How can I split data between multiple dynamic backend instances? It hard because I don't know how many I have because Google creates new one as load increases. I know I can launch exact number of resident backend instances. But how many ? 2, 5, 10 ? What if I have no load at all for a week. Constantly running 10 backend instances is too expensive. What do I do with data from clients while backend instance is dead/restarting? Thank you very much in advance for your thoughts.

    Read the article

  • How can I gather client's data on Google App Engine without using Datastore/Backend Instances too much?

    - by ruslan
    One of the projects I'm working on is online survey engine. It's my first big commercial project on Google App Engine. I need your advice on how to collect stats and efficiently record them in DataStore without bankrupting me. Initial requirements are: After user finishes survey client sends list of pairs [ID (int) + PercentHit (double)]. This list shows how close answers of this user match predefined answers of reference answerers (which identified by IDs). I call them "target IDs". Creator of the survey wants to see aggregated % for given IDs for last hour, particular timeframe or from the beginning of the survey. Some surveys may have thousands of target/reference answerers. So I created entity public class HitsStatsDO implements Serializable { @Id transient private Long id; transient private Long version = (long) 0; transient private Long startDate; @Parent transient private Key parent; // fake parent which contains target id @Transient int targetId; private double avgPercent; private long hitCount; } But writing HitsStatsDO for each target from each user would give a lot of data. For instance I had a survey with 3000 targets which was answered by ~4 million people within one week with 300K people taking survey in first day. Even if we assume they were answering it evenly for 24 hours it would give us ~1040 writes/second. Obviously it hits concurrent writes limit of Datastore. I decided I'll collect data for one hour and save that, that's why there are avgPercent and hitCount in HitsStatsDO. GAE instances are stateless so I had to use dynamic backend instance. There I have something like this: // Contains stats for one hour private class Shard { ReadWriteLock lock = new ReentrantReadWriteLock(); Map<Integer, HitsStatsDO> map = new HashMap<Integer, HitsStatsDO>(); // Key is target ID public void saveToDatastore(); public void updateStats(Long startDate, Map<Integer, Double> hits); } and map with shard for current hour and previous hour (which doesn't stay here for long) private HashMap<Long, Shard> shards = new HashMap<Long, Shard>(); // Key is HitsStatsDO.startDate So once per hour I dump Shard for previous hour to Datastore. Plus I have class LifetimeStats which keeps Map<Integer, HitsStatsDO> in memcached where map-key is target ID. Also in my backend shutdown hook method I dump stats for unfinished hour to Datastore. There is only one major issue here - I have only ONE backend instance :) It raises following questions on which I'd like to hear your opinion: Can I do this without using backend instance ? What if one instance is not enough ? How can I split data between multiple dynamic backend instances? It hard because I don't know how many I have because Google creates new one as load increases. I know I can launch exact number of resident backend instances. But how many ? 2, 5, 10 ? What if I have no load at all for a week. Constantly running 10 backend instances is too expensive. What do I do with data from clients while backend instance is dead/restarting?

    Read the article

  • Are there any Microsoft Exchange Clients for iOS and Android that store their local data in an encrypted manner?

    - by Zac B
    I don't feel like this is a product recommendation question, more of a "does this tech even exist and is it feasible" question, but if I'm wrong, feel free to give this question the boot. Context: Our company has a bunch of traveling employees who access the company's Exchange server via thier iDevices or android phones, but because of the data protection laws in the state where our company is based (and the nature of the data our company works with), a recent security audit found that all mobile devices (laptops, phones, etc) operated by our company need to have all company correspondence and related data encrypted all the time. For laptops, that was easy: BitLocker or TrueCrypt, problem solved. For phones and tablets, however, I'm stumped. Sure, you can put lock screens/passwords on the phones, but the data is still accessible via external extraction, as law enforcement authorities already know. Question: Are there any clients for Microsoft Exchange that run on iOS or Android which store local data encrypted? The people using our mobile devices do a lot of their work while offline, so just giving them OWA access with SSL connection security isn't enough. Are there apps/technologies that present an additional login credential prompt to decrypt locally stored data in the app's storage area on the phone? My gut reaction when I started looking into this was "that doesn't sound like something Apple would allow into the App Store", but I've been wrong before...

    Read the article

  • Security implications of adding www-data to /etc/sudoers to run php-cgi as a different user

    - by BMiner
    What I really want to do is allow the 'www-data' user to have the ability to launch php-cgi as another user. I just want to make sure that I fully understand the security implications. The server should support a shared hosting environment where various (possibly untrusted) users have chroot'ed FTP access to the server to store their HTML and PHP files. Then, since PHP scripts can be malicious and read/write others' files, I'd like to ensure that each users' PHP scripts run with the same user permissions for that user (instead of running as www-data). Long story short, I have added the following line to my /etc/sudoers file, and I wanted to run it past the community as a sanity check: www-data ALL = (%www-data) NOPASSWD: /usr/bin/php-cgi This line should only allow www-data to run a command like this (without a password prompt): sudo -u some_user /usr/bin/php-cgi ...where some_user is a user in the group www-data. What are the security implications of this? This should then allow me to modify my Lighttpd configuration like this: fastcgi.server += ( ".php" => (( "bin-path" => "sudo -u some_user /usr/bin/php-cgi", "socket" => "/tmp/php.socket", "max-procs" => 1, "bin-environment" => ( "PHP_FCGI_CHILDREN" => "4", "PHP_FCGI_MAX_REQUESTS" => "10000" ), "bin-copy-environment" => ( "PATH", "SHELL", "USER" ), "broken-scriptfilename" => "enable" )) ) ...allowing me to spawn new FastCGI server instances for each user.

    Read the article

  • Beware Sneaky Reads with Unique Indexes

    - by Paul White NZ
    A few days ago, Sandra Mueller (twitter | blog) asked a question using twitter’s #sqlhelp hash tag: “Might SQL Server retrieve (out-of-row) LOB data from a table, even if the column isn’t referenced in the query?” Leaving aside trivial cases (like selecting a computed column that does reference the LOB data), one might be tempted to say that no, SQL Server does not read data you haven’t asked for.  In general, that’s quite correct; however there are cases where SQL Server might sneakily retrieve a LOB column… Example Table Here’s a T-SQL script to create that table and populate it with 1,000 rows: CREATE TABLE dbo.LOBtest ( pk INTEGER IDENTITY NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( some_value, lob_data ) SELECT TOP (1000) N.n, @Data FROM Numbers N WHERE N.n <= 1000; Test 1: A Simple Update Let’s run a query to subtract one from every value in the some_value column: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; As you might expect, modifying this integer column in 1,000 rows doesn’t take very long, or use many resources.  The STATITICS IO and TIME output shows a total of 9 logical reads, and 25ms elapsed time.  The query plan is also very simple: Looking at the Clustered Index Scan, we can see that SQL Server only retrieves the pk and some_value columns during the scan: The pk column is needed by the Clustered Index Update operator to uniquely identify the row that is being changed.  The some_value column is used by the Compute Scalar to calculate the new value.  (In case you are wondering what the Top operator is for, it is used to enforce SET ROWCOUNT). Test 2: Simple Update with an Index Now let’s create a nonclustered index keyed on the some_value column, with lob_data as an included column: CREATE NONCLUSTERED INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); This is not a useful index for our simple update query; imagine that someone else created it for a different purpose.  Let’s run our update query again: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; We find that it now requires 4,014 logical reads and the elapsed query time has increased to around 100ms.  The extra logical reads (4 per row) are an expected consequence of maintaining the nonclustered index. The query plan is very similar to before (click to enlarge): The Clustered Index Update operator picks up the extra work of maintaining the nonclustered index. The new Compute Scalar operators detect whether the value in the some_value column has actually been changed by the update.  SQL Server may be able to skip maintaining the nonclustered index if the value hasn’t changed (see my previous post on non-updating updates for details).  Our simple query does change the value of some_data in every row, so this optimization doesn’t add any value in this specific case. The output list of columns from the Clustered Index Scan hasn’t changed from the one shown previously: SQL Server still just reads the pk and some_data columns.  Cool. Overall then, adding the nonclustered index hasn’t had any startling effects, and the LOB column data still isn’t being read from the table.  Let’s see what happens if we make the nonclustered index unique. Test 3: Simple Update with a Unique Index Here’s the script to create a new unique index, and drop the old one: CREATE UNIQUE NONCLUSTERED INDEX [UQ dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest (some_value) INCLUDE ( lob_data ) WITH ( FILLFACTOR = 100, MAXDOP = 1, SORT_IN_TEMPDB = ON ); GO DROP INDEX [IX dbo.LOBtest some_value (lob_data)] ON dbo.LOBtest; Remember that SQL Server only enforces uniqueness on index keys (the some_data column).  The lob_data column is simply stored at the leaf-level of the non-clustered index.  With that in mind, we might expect this change to make very little difference.  Let’s see: UPDATE dbo.LOBtest WITH (TABLOCKX) SET some_value = some_value - 1; Whoa!  Now look at the elapsed time and logical reads: Scan count 1, logical reads 2016, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   CPU time = 172 ms, elapsed time = 16172 ms. Even with all the data and index pages in memory, the query took over 16 seconds to update just 1,000 rows, performing over 52,000 LOB logical reads (nearly 16,000 of those using read-ahead). Why on earth is SQL Server reading LOB data in a query that only updates a single integer column? The Query Plan The query plan for test 3 looks a bit more complex than before: In fact, the bottom level is exactly the same as we saw with the non-unique index.  The top level has heaps of new stuff though, which I’ll come to in a moment. You might be expecting to find that the Clustered Index Scan is now reading the lob_data column (for some reason).  After all, we need to explain where all the LOB logical reads are coming from.  Sadly, when we look at the properties of the Clustered Index Scan, we see exactly the same as before: SQL Server is still only reading the pk and some_value columns – so what’s doing the LOB reads? Updates that Sneakily Read Data We have to go as far as the Clustered Index Update operator before we see LOB data in the output list: [Expr1020] is a bit flag added by an earlier Compute Scalar.  It is set true if the some_value column has not been changed (part of the non-updating updates optimization I mentioned earlier). The Clustered Index Update operator adds two new columns: the lob_data column, and some_value_OLD.  The some_value_OLD column, as the name suggests, is the pre-update value of the some_value column.  At this point, the clustered index has already been updated with the new value, but we haven’t touched the nonclustered index yet. An interesting observation here is that the Clustered Index Update operator can read a column into the data flow as part of its update operation.  SQL Server could have read the LOB data as part of the initial Clustered Index Scan, but that would mean carrying the data through all the operations that occur prior to the Clustered Index Update.  The server knows it will have to go back to the clustered index row to update it, so it delays reading the LOB data until then.  Sneaky! Why the LOB Data Is Needed This is all very interesting (I hope), but why is SQL Server reading the LOB data?  For that matter, why does it need to pass the pre-update value of the some_value column out of the Clustered Index Update? The answer relates to the top row of the query plan for test 3.  I’ll reproduce it here for convenience: Notice that this is a wide (per-index) update plan.  SQL Server used a narrow (per-row) update plan in test 2, where the Clustered Index Update took care of maintaining the nonclustered index too.  I’ll talk more about this difference shortly. The Split/Sort/Collapse combination is an optimization, which aims to make per-index update plans more efficient.  It does this by breaking each update into a delete/insert pair, reordering the operations, removing any redundant operations, and finally applying the net effect of all the changes to the nonclustered index. Imagine we had a unique index which currently holds three rows with the values 1, 2, and 3.  If we run a query that adds 1 to each row value, we would end up with values 2, 3, and 4.  The net effect of all the changes is the same as if we simply deleted the value 1, and added a new value 4. By applying net changes, SQL Server can also avoid false unique-key violations.  If we tried to immediately update the value 1 to a 2, it would conflict with the existing value 2 (which would soon be updated to 3 of course) and the query would fail.  You might argue that SQL Server could avoid the uniqueness violation by starting with the highest value (3) and working down.  That’s fine, but it’s not possible to generalize this logic to work with every possible update query. SQL Server has to use a wide update plan if it sees any risk of false uniqueness violations.  It’s worth noting that the logic SQL Server uses to detect whether these violations are possible has definite limits.  As a result, you will often receive a wide update plan, even when you can see that no violations are possible. Another benefit of this optimization is that it includes a sort on the index key as part of its work.  Processing the index changes in index key order promotes sequential I/O against the nonclustered index. A side-effect of all this is that the net changes might include one or more inserts.  In order to insert a new row in the index, SQL Server obviously needs all the columns – the key column and the included LOB column.  This is the reason SQL Server reads the LOB data as part of the Clustered Index Update. In addition, the some_value_OLD column is required by the Split operator (it turns updates into delete/insert pairs).  In order to generate the correct index key delete operation, it needs the old key value. The irony is that in this case the Split/Sort/Collapse optimization is anything but.  Reading all that LOB data is extremely expensive, so it is sad that the current version of SQL Server has no way to avoid it. Finally, for completeness, I should mention that the Filter operator is there to filter out the non-updating updates. Beating the Set-Based Update with a Cursor One situation where SQL Server can see that false unique-key violations aren’t possible is where it can guarantee that only one row is being updated.  Armed with this knowledge, we can write a cursor (or the WHILE-loop equivalent) that updates one row at a time, and so avoids reading the LOB data: SET NOCOUNT ON; SET STATISTICS XML, IO, TIME OFF;   DECLARE @PK INTEGER, @StartTime DATETIME; SET @StartTime = GETUTCDATE();   DECLARE curUpdate CURSOR LOCAL FORWARD_ONLY KEYSET SCROLL_LOCKS FOR SELECT L.pk FROM LOBtest L ORDER BY L.pk ASC;   OPEN curUpdate;   WHILE (1 = 1) BEGIN FETCH NEXT FROM curUpdate INTO @PK;   IF @@FETCH_STATUS = -1 BREAK; IF @@FETCH_STATUS = -2 CONTINUE;   UPDATE dbo.LOBtest SET some_value = some_value - 1 WHERE CURRENT OF curUpdate; END;   CLOSE curUpdate; DEALLOCATE curUpdate;   SELECT DATEDIFF(MILLISECOND, @StartTime, GETUTCDATE()); That completes the update in 1280 milliseconds (remember test 3 took over 16 seconds!) I used the WHERE CURRENT OF syntax there and a KEYSET cursor, just for the fun of it.  One could just as well use a WHERE clause that specified the primary key value instead. Clustered Indexes A clustered index is the ultimate index with included columns: all non-key columns are included columns in a clustered index.  Let’s re-create the test table and data with an updatable primary key, and without any non-clustered indexes: IF OBJECT_ID(N'dbo.LOBtest', N'U') IS NOT NULL DROP TABLE dbo.LOBtest; GO CREATE TABLE dbo.LOBtest ( pk INTEGER NOT NULL, some_value INTEGER NULL, lob_data VARCHAR(MAX) NULL, another_column CHAR(5) NULL, CONSTRAINT [PK dbo.LOBtest pk] PRIMARY KEY CLUSTERED (pk ASC) ); GO DECLARE @Data VARCHAR(MAX); SET @Data = REPLICATE(CONVERT(VARCHAR(MAX), 'x'), 65540);   WITH Numbers (n) AS ( SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM master.sys.columns C1, master.sys.columns C2 ) INSERT LOBtest WITH (TABLOCKX) ( pk, some_value, lob_data ) SELECT TOP (1000) N.n, N.n, @Data FROM Numbers N WHERE N.n <= 1000; Now here’s a query to modify the cluster keys: UPDATE dbo.LOBtest SET pk = pk + 1; The query plan is: As you can see, the Split/Sort/Collapse optimization is present, and we also gain an Eager Table Spool, for Halloween protection.  In addition, SQL Server now has no choice but to read the LOB data in the Clustered Index Scan: The performance is not great, as you might expect (even though there is no non-clustered index to maintain): Table 'LOBtest'. Scan count 1, logical reads 2011, physical reads 0, read-ahead reads 0, lob logical reads 36015, lob physical reads 0, lob read-ahead reads 15992.   Table 'Worktable'. Scan count 1, logical reads 2040, physical reads 0, read-ahead reads 0, lob logical reads 34000, lob physical reads 0, lob read-ahead reads 8000.   SQL Server Execution Times: CPU time = 483 ms, elapsed time = 17884 ms. Notice how the LOB data is read twice: once from the Clustered Index Scan, and again from the work table in tempdb used by the Eager Spool. If you try the same test with a non-unique clustered index (rather than a primary key), you’ll get a much more efficient plan that just passes the cluster key (including uniqueifier) around (no LOB data or other non-key columns): A unique non-clustered index (on a heap) works well too: Both those queries complete in a few tens of milliseconds, with no LOB reads, and just a few thousand logical reads.  (In fact the heap is rather more efficient). There are lots more fun combinations to try that I don’t have space for here. Final Thoughts The behaviour shown in this post is not limited to LOB data by any means.  If the conditions are met, any unique index that has included columns can produce similar behaviour – something to bear in mind when adding large INCLUDE columns to achieve covering queries, perhaps. Paul White Email: [email protected] Twitter: @PaulWhiteNZ

    Read the article

  • Thank You for a Great Welcome for Oracle GoldenGate 11g Release 2

    - by Irem Radzik
    Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} Yesterday morning we had two launch webcasts for Oracle GoldenGate 11g Release 2. I had the pleasure to present, as well as moderate the Q&A panels in both of these webcasts. Both events had hundreds of live attendees, sending us over 150 questions. Even though we left 30 minutes for Q&A, it was not nearly enough time to address for all the insightful questions our audience sent. Our product management team and I really appreciate the interaction we had yesterday and we are starting to respond back with outstanding questions today. Oracle GoldenGate’s new release launch also had great welcome from the media. You can find the links for various articles on the new release below: ITBusinessEdge Oracle Embraces Cross-Platform Data Integration Information Week: Oracle Real-Time Advance Taps Compressed Data Integration Developer News, Oracle GoldenGate Adds Deeper Oracle Integration, Extends Real-Time Performance CIO, Oracle GoldenGate Buddies Up with Sibling Software DBTA, Real-Time Data Integration: Oracle GoldenGate 11g Release 2 Now Available CBR Oracle unveils GoldenGate 11g Release 2 real-time data integration application In this blog, I want to address some of the frequently asked questions that came up during the webcasts. You can find the top questions and their answers along with related resources below. We will continue to address frequently asked questions via future blogs. Q: Will the new Integrated Capture for Oracle Database replace the Classic Capture? If not, which one do I use when? A: No, Classic Capture will be around for long time. Core platform specific features, bug fixes, and patches will be available for both Capture processes.Oracle Database specific features will be only available in the Integrated Capture. The Integrated Capture for Oracle Database is an option for users that need to capture data from compressed tables or need support for XML data types, XA on RAC. Users who don’t leverage these features should continue to use our Classic Capture. For more information on Oracle GoldenGate 11g Release 2 I recommend to check out the White paper: Oracle GoldenGate 11gR2 New Features as well as other technical white papers we have on OTN.                                                         For those of you coming to OpenWorld, please attend the related session: Extracting Data in Oracle GoldenGate Integrated Capture Mode, Monday Oct 1st 1:45pm Moscone South – 102 to learn more about this new feature. Q: What is new in Conflict Detection and Resolution? And how does it work? A: There are now pre-built functions to identify the conditions under which an error occurs and how to handle the record when the condition occurs. Error conditions handled include inserts into a target table where the row already exists, updates or deletes to target table rows that exist, but the original source data (before columns) do not match the existing data in the target row, and updates or deletes where the row does not exist in the target database table.Foreach of these conditions a method to handle the error is specified.  Please check out our recent blog on this topic and the White paper: Oracle GoldenGate 11gR2 New Features white paper.  Also, for those attending OpenWorld please attend the session: Best Practices for Conflict Detection and Resolution in Oracle GoldenGate for Active/Active-  Wednesday Oct 3rd  3:30pm Mascone 3000 Q: Does Oracle GoldenGate Veridata and the Management Pack require additional licenses, or is it incorporated with the GoldenGate license? A: Oracle GoldenGate Veridata and Oracle Management Pack for Oracle GoldenGate are additional products and require separate licenses. Please check out Oracle's price list here. Q: Does GoldenGate - Oracle Enterprise Manager Plug-in require additional license? A: Oracle Enterprise Manager Plug-in is included in the Oracle Management Pack for Oracle GoldenGate license, which is separate from Oracle GoldenGate license. There is no separate license for the Enterprise Manager Plug-in by itself. Oracle GoldenGate Monitor, Oracle GoldenGate Director, and Enterprise Manager Plug-in are included in the Management Pack for Oracle GoldenGate license. Please check out Management Pack for Oracle GoldenGate data sheet for more info on this product bundle. Q: Is Oracle GoldenGate replacing Oracle Streams product? A: Oracle GoldenGate is the strategic data replication product. Therefore, Oracle Streams will continue to be supported, but will not be actively enhanced. Rather, the best elements of Oracle Streams will be added to Oracle GoldenGate. Conflict management is one of them and with the latest release Oracle GoldenGate has a more advanced conflict management offering. Current customers depending on Oracle Streams will continue to be fully supported. Q: How is Oracle GoldenGate different than Oracle Data Integrator? A: Oracle Data Integrator is designed for fast bulk data movement and transformation between heterogeneous systems, while GoldenGate is designed for real-time movement of transactions between heterogeneous systems. These two products are completely complementary where GoldenGate provides low-impact real-time change data capture and delivery to a staging area on the target. And Oracle Data Integrator transforms this data and loads the DW tables. In fact, Oracle Data Integrator integrates with GoldenGate to use GoldenGate’s Capture process as one option for its CDC mechanism. We have several customers that deployed GoldenGate and ODI together to feed real-time data to their data warehousing solutions. Please also check out Oracle Data Integrator Changed Data Capture with Oracle GoldenGate Data Sheet (PDF). Thank you again very much for welcoming Oracle GoldenGate 11g Release 2 and stay in touch with us for more exciting news, updates, and events.

    Read the article

  • Unlocking Productivity

    - by Michael Snow
    Unlocking Productivity in Life Sciences with Consolidated Content Management by Joe Golemba, Vice President, Product Management, Oracle WebCenter As life sciences organizations look to become more operationally efficient, the ability to effectively leverage information is a competitive advantage. Whether data mining at the drug discovery phase or prepping the sales team before a product launch, content management can play a key role in developing, organizing, and disseminating vital information. The goal of content management is relatively straightforward: put the information that people need where they can find it. A number of issues can complicate this; information sits in many different systems, each of those systems has its own security, and the information in those systems exists in many different formats. Identifying and extracting pertinent information from mountains of farflung data is no simple job, but the alternative—wasted effort or even regulatory compliance issues—is worse. An integrated information architecture can enable health sciences organizations to make better decisions, accelerate clinical operations, and be more competitive. Unstructured data matters Often when we think of drug development data, we think of structured data that fits neatly into one or more research databases. But structured data is often directly supported by unstructured data such as experimental protocols, reaction conditions, lot numbers, run times, analyses, and research notes. As life sciences companies seek integrated views of data, they are typically finding diverse islands of data that seemingly have no relationship to other data in the organization. Information like sales reports or call center reports can be locked into siloed systems, and unavailable to the discovery process. Additionally, in the increasingly networked clinical environment, Web pages, instant messages, videos, scientific imaging, sales and marketing data, collaborative workspaces, and predictive modeling data are likely to be present within an organization, and each source potentially possesses information that can help to better inform specific efforts. Historically, content management solutions that had 21CFR Part 11 capabilities—electronic records and signatures—were focused mainly on content-enabling manufacturing-related processes. Today, life sciences companies have many standalone repositories, requiring different skills, service level agreements, and vendor support costs to manage them. With the amount of content doubling every three to six months, companies have recognized the need to manage unstructured content from the beginning, in order to increase employee productivity and operational efficiency. Using scalable and secure enterprise content management (ECM) solutions, organizations can better manage their unstructured content. These solutions can also be integrated with enterprise resource planning (ERP) systems or research systems, making content available immediately, in the context of the application and within the flow of the employee’s typical business activity. Administrative safeguards—such as content de-duplication—can also be applied within ECM systems, so documents are never recreated, eliminating redundant efforts, ensuring one source of truth, and maintaining content standards in the organization. Putting it in context Consolidating structured and unstructured information in a single system can greatly simplify access to relevant information when it is needed through contextual search. Using contextual filters, results can include therapeutic area, position in the value chain, semantic commonalities, technology-specific factors, specific researchers involved, or potential business impact. The use of taxonomies is essential to organizing information and enabling contextual searches. Taxonomy solutions are composed of a hierarchical tree that defines the relationship between different life science terms. When overlaid with additional indexing related to research and/or business processes, it becomes possible to effectively narrow down the amount of data that is returned during searches, as well as prioritize results based on specific criteria and/or prior search history. Thus, search results are more accurate and relevant to an employee’s day-to-day work. For example, a search for the word "tissue" by a lab researcher would return significantly different results than a search for the same word performed by someone in procurement. Of course, diverse data repositories, combined with the immense amounts of data present in an organization, necessitate that the data elements be regularly indexed and cached beforehand to enable reasonable search response times. In its simplest form, indexing of a single, consolidated data warehouse can be expected to be a relatively straightforward effort. However, organizations require the ability to index multiple data repositories, enabling a single search to reference multiple data sources and provide an integrated results listing. Security and compliance Beyond yielding efficiencies and supporting new insight, an enterprise search environment can support important security considerations as well as compliance initiatives. For example, the systems enable organizations to retain the relevance and the security of the indexed systems, so users can only see the results to which they are granted access. This is especially important as life sciences companies are working in an increasingly networked environment and need to provide secure, role-based access to information across multiple partners. Although not officially required by the 21 CFR Part 11 regulation, the U.S. Food and Drug Administraiton has begun to extend the type of content considered when performing relevant audits and discoveries. Having an ECM infrastructure that provides centralized management of all content enterprise-wide—with the ability to consistently apply records and retention policies along with the appropriate controls, validations, audit trails, and electronic signatures—is becoming increasingly critical for life sciences companies. Making the move Creating an enterprise-wide ECM environment requires moving large amounts of content into a single enterprise repository, a daunting and risk-laden initiative. The first key is to focus on data taxonomy, allowing content to be mapped across systems. The second is to take advantage new tools which can dramatically speed and reduce the cost of the data migration process through automation. Additional content need not be frozen while it is migrated, enabling productivity throughout the process. The ability to effectively leverage information into success has been gaining importance in the life sciences industry for years. The rapid adoption of enterprise content management, both in operational processes as well as in scientific management, are clear indicators that the companies are looking to use all available data to be better informed, improve decision making, minimize risk, and increase time to market, to maintain profitability and be more competitive. As more and more varieties and sources of information are brought under the strategic management umbrella, the ability to divine knowledge from the vast pool of information is increasingly difficult. Simple search engines and basic content management are increasingly unable to effectively extract the right information from the mountains of data available. By bringing these tools into context and integrating them with business processes and applications, we can effectively focus on the right decisions that make our organizations more profitable. More Information Oracle will be exhibiting at DIA 2012 in Philadelphia on June 25-27. Stop by our booth Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin:0in; mso-para-margin-bottom:.0001pt; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin; mso-bidi-font-family:"Times New Roman"; mso-bidi-theme-font:minor-bidi;} (#2825) to learn more about the advantages of a centralized ECM strategy and see the Oracle WebCenter Content solution, our 21 CFR Part 11 compliant content management platform.

    Read the article

  • How can I force a merge of all WAL files in pg_xlog back into my base "data" directory?

    - by Zac B
    Question: Is there a way to tell Postgres (9.2) to "merge all WAL files in pg_xlog back into the non-WAL data files, and then delete all WAL files successfully merged?" I would like to be able to "force" this operation; i.e. checkpoint_segments or archiving settings should be ignored. The filesystem WAL buffer (pg_xlog) directory should be emptied, or nearly emptied. It's fine if some or all of the space consumed by the pg_xlog directory is then consumed by the data directory; our DBA has asked for WAL database backups without any backlogged WALs, but space consumption is not a concern. Having near-zero WAL activity during this operation is a fine constraint. I can ensure that the database server is either shut down or not connectible (zero user-generated transaction load) during this process. Essentially, I'd like Postgres to ignore archiving/checkpoint retention policies temporarily, and flush all WAL activity to the core database files, leaving pg_xlog in the same state as if the database were recently created--with very few WAL files. What I've Tried: I know that the pg_basebackup utility performs something like this (it generates an almost-all-WALs-merged copy of a Postgres instance's data directory), but we aren't ready to use it on all our systems yet, as we are still testing replication settings; I'm hoping for a more short-term solution. I've tried issuing CHECKPOINT commands, but they just recycle one WAL file and replace it with another (that is, if they do anything at all; if I issue them during database idle time, they do nothing). pg_switch_xlog() similarly just forces a switch to the next log segment; it doesn't flush all queued/buffered segments. I've also played with the pg_resetxlog utility. That utility sort of does what I want, but all of its usage docs seem to indicate that it destroys (rather than flushing out of the transaction log and into the main data files) some or all of the WAL data. Is that impression accurate? If not, can I use pg_resetxlog during a zero-WAL-activity period to force a flush of all queued WAL data to non-WAL data? If the answer to that is negative, how can I achieve this goal? Thanks!

    Read the article

< Previous Page | 431 432 433 434 435 436 437 438 439 440 441 442  | Next Page >