Search Results

Search found 480 results on 20 pages for 'estimate'.

Page 18/20 | < Previous Page | 14 15 16 17 18 19 20  | Next Page >

  • UML assignment question

    - by waitinforatrain
    Hi guys, Sorry, I know this is a very lame question to ask and not of any use to anyone else. I have an assignment in UML due tomorrow and I don't even know the basics (all-nighter ahead!). I'm not looking for a walkthrough, I simply want your opinion on something. The assignment is as follows (you only need to skim over it!): ============= Gourmet Surprise (GS) is a small catering firm with five employees. During a typical weekend, GS caters fifteen events with twenty to fifty people each. The business has grown rapidly over the past year and the owner wants to install a new computer system for managing the ordering and buying process. GS has a set of ten standard menus. When potential customers call, the receptionist describes the menus to them. If the customer decides to book an event (dinner, lunch, picnic, finger food etc.), the receptionist records the customer information (e.g., name, address, phone number, etc.) and the information about the event (e.g., place, date, time, which one of the standard menus, total price) on a contract. The customer is then faxed a copy of the contract and must sign and return it along with a deposit (often a credit card or by check) before the event is officially booked. The remaining money is collected when the catering is delivered. Sometimes, the customer wants something special (e.g., birthday cake). In this case, the receptionist takes the information and gives it to the owner who determines the cost; the receptionist then calls the customer back with the price information. Sometimes the customer accepts the price, other times, the customer requests some changes that have to go back to the owner for a new cost estimate. Each week, the owner looks through the events scheduled for that weekend and orders the supplies (e.g., plates) and food (e.g., bread, chicken) needed to make them. The owner would like to use the system for marketing as well. It should be able to track how customers learned about GS, and identify repeat customers, so that GS can mail special offers to them. The owner also wants to track the events on which GS sent a contract, but the customer never signed the contract and actually booked a GS. Exercise: Create an activity diagram and a use case model (complete with a set of detail use case descriptions) for the above system. Produce an initial domain model (class diagram) based on these descriptions. Elaborate the use cases into sequence diagrams, and include any state diagrams necessary. Finally use the information from these dynamic models to expand the domain model into a full application model. ============= In your opinion, do you think this question is asking me to come up with a package for an online ordering system to replace the system described above, or to create UML diagrams that facilitate the existing telephone-based system?

    Read the article

  • Please give the solution of the following programs in R Programming

    - by NEETHU
    Table below gives data concerning the performance of 28 national football league teams in 1976.It is suspected that the no. of yards gained rushing by opponents(x8) has an effect on the no. of games won by a team(y) (a)Fit a simple linear regression model relating games won by y to yards gained rushing by opponents x8. (b)Construct the analysis of variance table and test for significance of regression. (c)Find a 95% CI on the slope. (d)What percent of the total variability in y is explained by this model. (e)Find a 95% CI on the mean number of games won in opponents yards rushing is limited to 2000 yards. Team y x8 1 10 2205 2 11 2096 3 11 1847 4 13 1803 5 10 1457 6 11 1848 7 10 1564 8 11 1821 9 4 2577 10 2 2476 11 7 1984 12 10 1917 13 9 1761 14 9 1709 15 6 1901 16 5 2288 17 5 2072 18 5 2861 19 6 2411 20 4 2289 21 3 2203 22 3 2592 23 4 2053 24 10 1979 25 6 2048 26 8 1786 27 2 2876 28 0 2560 Suppose we would like to use the model developed in problem 1 to predict the no. of games a team will win if it can limit opponents yards rushing to 1800 yards. Find a point estimate of the no. of games won when x8=1800.Find a 905 prediction interval on the no. of games won. The purity of Oxygen produced by a fractionation process is thought to be percentage of Hydrocarbon in the main condenser of the processing unit .20 samples are shown below. Purity(%) Hydrocarbon(%) 86.91 1.02 89.85 1.11 90.28 1.43 86.34 1.11 92.58 1.01 87.33 0.95 86.29 1.11 91.86 0.87 95.61 1.43 89.86 1.02 96.73 1.46 99.42 1.55 98.66 1.55 96.07 1.55 93.65 1.4 87.31 1.15 95 1.01 96.85 0.99 85.2 0.95 90.56 0.98 (a)Fit a simple linear regression model to the data. (b)Test the hypothesis H0:ß=0 (c)Calculate R2 . (d)Find a 95% CI on the slope. (e)Find a 95% CI on the mean purity and the Hydrocarbon % is 1. Consider the Oxygen plant data in Problem3 and assume that purity and Hydrocarbon percentage are jointly normally distributed r.vs (a)What is the correlation between Oxygen purity and Hydrocarbon% (b)Test the hypothesis that ?=0. (c)Construct a 95% CI for ?.

    Read the article

  • Importing a large dataset into a database

    - by peaceful
    I'm a beginning programmer in the relevant areas to this question, so if possible, it'd be helpful to avoid assuming I know a lot already. I'm trying to import the OpenLibrary dataset into a local Postgres database. After it's imported, I plan to use it as a starting seed for a Ruby on Rails application that will include information on books. The OpenLibrary datasets are available here, in a modified JSON format: http://openlibrary.org/dev/docs/jsondump I only need very basic information for my application, much less than what is provided in the dumps. I'm only trying to get out book titles, author names, and relationships between books and authors. Below are two typical entries from their dataset, the first for an author, and the second for a book (they seem to have an entry for each edition of a book). The entries seem to lead off with a primary key, and then with a type, before including the actual JSON database dump. /a/OL2A /type/author {"name": "U. Venkatakrishna Rao", "personal_name": "U. Venkatakrishna Rao", "last_modified": {"type": "/type/datetime", "value": "2008-09-10 08:44:01.978456"}, "key": "/a/OL2A", "birth_date": "1904", "type": {"key": "/type/author"}, "id": 99, "revision": 3} /b/OL345M /type/edition {"publishers": ["Social Science Research Project, Dept. of Geography, University of Dacca"], "pagination": "ii, 54 p.", "title": "Land use in Fayadabad area", "lccn": ["sa 65000491"], "subject_place": ["East Pakistan", "Dacca region."], "number_of_pages": 54, "languages": [{"comment": "initial import", "code": "eng", "name": "English", "key": "/l/eng"}], "lc_classifications": ["S471.P162 E23"], "publish_date": "1963", "publish_country": "pk ", "key": "/b/OL345M", "authors": [{"birth_date": "1911", "name": "Nafis Ahmad", "key": "/a/OL302A", "personal_name": "Nafis Ahmad"}], "publish_places": ["Dacca, East Pakistan"], "by_statement": "[by] Nafis Ahmad and F. Karim Khan.", "oclc_numbers": ["4671066"], "contributions": ["Khan, Fazle Karim, joint author."], "subjects": ["Land use -- East Pakistan -- Dacca region."]} The size of the uncompressed dumps are enormous, about 2GB for the authors list, and 18GB for the book editions list. OpenLibrary does not provide any tools for this themselves, they provide a simple unoptimized Python script for reading in sample data (which unlike the actual dumps comes in pure JSON format), but they estimate if that was modified for use on their actual data it would take 2 months (!) to finish loading the data. How can I read this into the database? I assume I'll need to write a program to do this. What language and any guidance on how I should do it to finish in a reasonable amount of time? The only scripting language I have any experience with is Ruby.

    Read the article

  • Find location using only distance and range?

    - by pinnacler
    Triangulation works by checking your angle to three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and it's to my right at 90 degrees." Repeat 2 more times for different targets and angles. Trilateration works by checking your distance from three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and I'm 100 meters away from that." Repeat 2 more times for different targets and ranges. But both of those methods rely on knowing WHAT you're looking at. Say you're in a forest and you can't differentiate between trees, but you know where key trees are. These trees have been hand picked as "landmarks." You have a robot moving through that forest slowly. Do you know of any ways to determine location based solely off of angle and range, exploiting geometry between landmarks? Note, you will see other trees as well, so you won't know which trees are key trees. Ignore the fact that a target may be occluded. Our pre-algorithm takes care of that. 1) If this exists, what's it called? I can't find anything. 2) What do you think the odds are of having two identical location 'hits?' I imagine it's fairly rare. 3) If there are two identical location 'hits,' how can I determine my exact location after I move the robot next. (I assume the chances of having 2 occurrences of EXACT angles in a row, after I reposition the robot, would be statistically impossible, barring a forest growing in rows like corn). Would I just calculate the position again and hope for the best? Or would I somehow incorporate my previous position estimate into my next guess? If this exists, I'd like to read about it, and if not, develop it as a side project. I just don't have time to reinvent the wheel right now, nor have the time to implement this from scratch. So if it doesn't exist, I'll have to figure out another way to localize the robot since that's not the aim of this research, if it does, lets hope it's semi-easy.

    Read the article

  • Workflow for statistical analysis and report writing

    - by ws
    Does anyone have any wisdom on workflows for data analysis related to custom report writing? The use-case is basically this: Client commissions a report that uses data analysis, e.g. a population estimate and related maps for a water district. The analyst downloads some data, munges the data and saves the result (e.g. adding a column for population per unit, or subsetting the data based on district boundaries). The analyst analyzes the data created in (2), gets close to her goal, but sees that needs more data and so goes back to (1). Rinse repeat until the tables and graphics meet QA/QC and satisfy the client. Write report incorporating tables and graphics. Next year, the happy client comes back and wants an update. This should be as simple as updating the upstream data by a new download (e.g. get the building permits from the last year), and pressing a "RECALCULATE" button, unless specifications change. At the moment, I just start a directory and ad-hoc it the best I can. I would like a more systematic approach, so I am hoping someone has figured this out... I use a mix of spreadsheets, SQL, ARCGIS, R, and Unix tools. Thanks! PS: Below is a basic Makefile that checks for dependencies on various intermediate datasets (w/ ".RData" suffix) and scripts (".R" suffix). Make uses timestamps to check dependencies, so if you 'touch ss07por.csv', it will see that this file is newer than all the files / targets that depend on it, and execute the given scripts in order to update them accordingly. This is still a work in progress, including a step for putting into SQL database, and a step for a templating language like sweave. Note that Make relies on tabs in its syntax, so read the manual before cutting and pasting. Enjoy and give feedback! http://www.gnu.org/software/make/manual/html%5Fnode/index.html#Top R=/home/wsprague/R-2.9.2/bin/R persondata.RData : ImportData.R ../../DATA/ss07por.csv Functions.R $R --slave -f ImportData.R persondata.Munged.RData : MungeData.R persondata.RData Functions.R $R --slave -f MungeData.R report.txt: TabulateAndGraph.R persondata.Munged.RData Functions.R $R --slave -f TabulateAndGraph.R report.txt

    Read the article

  • What are the types and inner workings of a query optimizer?

    - by Frank Developer
    As I understand it, most query optimizers are cost-based. Some can be influenced by hints like FIRST_ROWS(). Others are tailored for OLAP. Is it possible to know more detailed logic about how Informix IDS and SE's optimizers decide what's the best route for processing a query, other than SET EXPLAIN? Is there any documentation which illustrates the ranking of SELECT statements? I would imagine that "SELECT col FROM table WHERE ROWID = n" ranks 1st. What are the rest of them?.. If I'm not mistaking, Informix's ROWID is a SERIAL(INT) which allows for a max. of 2GB nrows, or maybe it uses INT9 for TB's nrows?.. However, I think Oracle uses HEX values for ROWID. Too bad ROWID can't be oftenly used, since a rows ROWID can change. So maybe ROWID is used by the optimizer as a counter? Perhaps, it could be used for implementing the query progress idea I mentioned in my "Begin viewing query results before query completes" question? For some reason, I feel it wouldn't be that difficult to report a query's progress while being processed, perhaps at the expense of some slight overhead, but it would be nice to know ahead of time: A "Google-like" estimate of how many rows meet a query's criteria, display it's progress every 100, 200, 500 or 1,000 rows, give users the ability to cancel it at anytime and start displaying the qualifying rows as they are being put into the current list, while it continues searching?.. This is just one example, perhaps we could think other neat/useful features, the ingridients are more or less there. Perhaps we could fine-tune each query with more granularity than currently available? OLTP queries tend to be mostly static and pre-defined. The "what-if's" are more OLAP, so let's try to add more control and intelligence to it? So, therefore, being able to more precisely control, not "hint-influence" a query is what's needed and therefore it would be necessary to know how the optimizers logic is programmed. We can then have Dynamic SELECT and other statements for specific situations! Maybe even tell IDS to read blocks of indexes nodes at-a-time instead of one-by-one, etc. etc.

    Read the article

  • SET game odds simulation (MATLAB)

    - by yuk
    Here is an interesting problem for your weekend. :) I recently find the great card came - SET. Briefly, there are 81 cards with the four features: symbol (oval, squiggle or diamond), color (red, purple or green), number (one, two or three) or shading (solid, striped or open). The task is to find (from selected 12 cards) a SET of 3 cards, in which each of the four features is either all the same on each card or all different on each card (no 2+1 combination). In my free time I've decided to code it in MATLAB to find a solution and to estimate odds of having a set in randomly selected cards. Here is the code: %% initialization K = 12; % cards to draw NF = 4; % number of features (usually 3 or 4) setallcards = unique(nchoosek(repmat(1:3,1,NF),NF),'rows'); % all cards: rows - cards, columns - features setallcomb = nchoosek(1:K,3); % index of all combinations of K cards by 3 %% test tic NIter=1e2; % number of test iterations setexists = 0; % test results holder % C = progress('init'); % if you have progress function from FileExchange for d = 1:NIter % C = progress(C,d/NIter); % cards for current test setdrawncardidx = randi(size(setallcards,1),K,1); setdrawncards = setallcards(setdrawncardidx,:); % find all sets in current test iteration for setcombidx = 1:size(setallcomb,1) setcomb = setdrawncards(setallcomb(setcombidx,:),:); if all(arrayfun(@(x) numel(unique(setcomb(:,x))), 1:NF)~=2) % test one combination setexists = setexists + 1; break % to find only the first set end end end fprintf('Set:NoSet = %g:%g = %g:1\n', setexists, NIter-setexists, setexists/(NIter-setexists)) toc 100-1000 iterations are fast, but be careful with more. One million iterations takes about 15 hours on my home computer. Anyway, with 12 cards and 4 features I've got around 13:1 of having a set. This is actually a problem. The instruction book said this number should be 33:1. And it was recently confirmed by Peter Norvig. He provides the Python code, but I didn't test it. So can you find an error?

    Read the article

  • Find location using only distance and bearing?

    - by pinnacler
    Triangulation works by checking your angle to three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and it's to my right at 90 degrees." Repeat 2 more times for different targets and angles. Trilateration works by checking your distance from three KNOWN targets. "I know the that's the Lighthouse of Alexandria, it's located here (X,Y) on a map, and I'm 100 meters away from that." Repeat 2 more times for different targets and ranges. But both of those methods rely on knowing WHAT you're looking at. Say you're in a forest and you can't differentiate between trees, but you know where key trees are. These trees have been hand picked as "landmarks." You have a robot moving through that forest slowly. Do you know of any ways to determine location based solely off of angle and range, exploiting geometry between landmarks? Note, you will see other trees as well, so you won't know which trees are key trees. Ignore the fact that a target may be occluded. Our pre-algorithm takes care of that. 1) If this exists, what's it called? I can't find anything. 2) What do you think the odds are of having two identical location 'hits?' I imagine it's fairly rare. 3) If there are two identical location 'hits,' how can I determine my exact location after I move the robot next. (I assume the chances of having 2 occurrences of EXACT angles in a row, after I reposition the robot, would be statistically impossible, barring a forest growing in rows like corn). Would I just calculate the position again and hope for the best? Or would I somehow incorporate my previous position estimate into my next guess? If this exists, I'd like to read about it, and if not, develop it as a side project. I just don't have time to reinvent the wheel right now, nor have the time to implement this from scratch. So if it doesn't exist, I'll have to figure out another way to localize the robot since that's not the aim of this research, if it does, lets hope it's semi-easy.

    Read the article

  • iPhone Image Resources, ICO vs PNG, app bundle filesize

    - by Jasarien
    My application has a collection of around 1940 icons that are used throughout. They're currently in ICO and new images provided to me come in ICO format too. I have noticed that they contain a 16x16 and 32x32 representation of each icon in one file. Each file is roughly 4KB in filesize (as reported by finder, but ls reports that they vary from being ~1000 bytes to 5000 bytes) A very small number of these icons only contain the 32x32 representation, and as a result are only around 700 bytes in size. Currently I am bundling these icons with my application and they are inflating the size of the app a bit more than I would like. Altogether, the images total just about 25.5MB. Xcode must do some kind of compression because the resulting app bundle is about 12.4MB. Compressing this further into a ZIP (as it would be when submitted to the App Store), results in a final file of 5.8MB. I'm aware that the maximum limit for over the air App Store downloads has been raised to 20MB since the introduction of the iPad (I'm not sure if that extends to iPhone apps as well as iPad apps though, if not the limit would be 10MB). My worry is that new icons are going to be added (sometimes up to 10 icons per week), and will continue to inflate the app bundle over time. What is the best way to distribute these icons with my app? Things I've tried and not had much success with: Converting the icons from ICO to PNG: I tried this in the hopes that the pngcrush utility would help out with the filesize. But it appears that it doesn't make much of a difference between a normal PNG and a crushed png (I believe it just optimises the image for display on the iPhone's GPU rather than compress it's size). Also in going from ICO to PNG actually increased the size of the icon file... Zipping the images, and then uncompressing them on first run. While this did reduce the overall image sizes, I found that the effort needed to unzip them, copy them to the documents folder and ensure that duplication doesn't happen on upgrades was too much hassle to be worth the benefit. Also, on original and 3G iPhones unzipping and copying around 25MB of images takes too long and creates a bad experience... Things I've considered but not yet tried: Instead of distributing the icons within the app bundle, host them online, and download each icon on demand (it depends on the user's data as to which icons will actually be displayed and when). Issues with this is that bandwidth costs money, and image downloads will be bandwidth intensive. However, my app currently has a small userbase of around 5,500 users (of which I estimate around 1500 to be active based on Flurry stats), and I have a huge unused bandwidth allowance with my current hosting package. So I'm open to thoughts on how to solve this tricky issue.

    Read the article

  • Can MySQL reasonably perform queries on billions of rows?

    - by haxney
    I am planning on storing scans from a mass spectrometer in a MySQL database and would like to know whether storing and analyzing this amount of data is remotely feasible. I know performance varies wildly depending on the environment, but I'm looking for the rough order of magnitude: will queries take 5 days or 5 milliseconds? Input format Each input file contains a single run of the spectrometer; each run is comprised of a set of scans, and each scan has an ordered array of datapoints. There is a bit of metadata, but the majority of the file is comprised of arrays 32- or 64-bit ints or floats. Host system |----------------+-------------------------------| | OS | Windows 2008 64-bit | | MySQL version | 5.5.24 (x86_64) | | CPU | 2x Xeon E5420 (8 cores total) | | RAM | 8GB | | SSD filesystem | 500 GiB | | HDD RAID | 12 TiB | |----------------+-------------------------------| There are some other services running on the server using negligible processor time. File statistics |------------------+--------------| | number of files | ~16,000 | | total size | 1.3 TiB | | min size | 0 bytes | | max size | 12 GiB | | mean | 800 MiB | | median | 500 MiB | | total datapoints | ~200 billion | |------------------+--------------| The total number of datapoints is a very rough estimate. Proposed schema I'm planning on doing things "right" (i.e. normalizing the data like crazy) and so would have a runs table, a spectra table with a foreign key to runs, and a datapoints table with a foreign key to spectra. The 200 Billion datapoint question I am going to be analyzing across multiple spectra and possibly even multiple runs, resulting in queries which could touch millions of rows. Assuming I index everything properly (which is a topic for another question) and am not trying to shuffle hundreds of MiB across the network, is it remotely plausible for MySQL to handle this? UPDATE: additional info The scan data will be coming from files in the XML-based mzML format. The meat of this format is in the <binaryDataArrayList> elements where the data is stored. Each scan produces = 2 <binaryDataArray> elements which, taken together, form a 2-dimensional (or more) array of the form [[123.456, 234.567, ...], ...]. These data are write-once, so update performance and transaction safety are not concerns. My naïve plan for a database schema is: runs table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | start_time | TIMESTAMP | | name | VARCHAR | |-------------+-------------| spectra table | column name | type | |----------------+-------------| | id | PRIMARY KEY | | name | VARCHAR | | index | INT | | spectrum_type | INT | | representation | INT | | run_id | FOREIGN KEY | |----------------+-------------| datapoints table | column name | type | |-------------+-------------| | id | PRIMARY KEY | | spectrum_id | FOREIGN KEY | | mz | DOUBLE | | num_counts | DOUBLE | | index | INT | |-------------+-------------| Is this reasonable?

    Read the article

  • Have I pushed the limits of my current VPS or is there room for optimization?

    - by JRameau
    I am currently on a mediatemple DV server (basic) 512mb dedicated ram, this is a CentOS based VPS with Plesk and Virtuozzo. My experience with it from day 1 has been bad and I only could sooth my server issues with several caching "Band-aids," but my sites are not as small as they were a year ago either so the issues have worsen. I have 3 Drupal installs running on separate (plesk) domains, 1 of those drupal installs is a multisite, that consists of 5-6 sites 2 of those sites are bringing in actual traffic. Those caching "Band-aids" I mentioned are APC, which seemed to help alot initially, and Drupal's Boost, which is considered a poorman's Varnish, it makes all my pages static for anonymous users. Last 30day combined estimate on Google Ananlytics: 90k visitors 260k pageviews. Issue: alot of downtime, I am continually checking if my sites are up, and lately I have been finding it down more than 3 times daily. Restarting Apache will bring it back up, for some time. I have google search every error message and looked up ways to optimize my DV server, and I am beyond stump what is my next move. Is this server bad, have I hit a impossibly low restriction such as the 12mb kernel memory barrier (kmemsize), is it on my end, do I need to optimize some more? *I have provided as much information as I can below, any help or suggestions given will be appreciated Common Error messages I see in the log: [error] (12)Cannot allocate memory: fork: Unable to fork new process [error] make_obcallback: could not import mod_python.apache.\n Traceback (most recent call last): File "/usr/lib/python2.4/site-packages/mod_python/apache.py", line 21, in ? import traceback File "/usr/lib/python2.4/traceback.py", line 3, in ? import linecache ImportError: No module named linecache [error] python_handler: no interpreter callback found. [warn-phpd] mmap cache can't open /var/www/vhosts/***/httpdocs/*** - Too many open files in system (pid ***) [alert] Child 8125 returned a Fatal error... Apache is exiting! [emerg] (43)Identifier removed: couldn't grab the accept mutex [emerg] (22)Invalid argument: couldn't release the accept mutex cat /proc/user_beancounters: Version: 2.5 uid resource held maxheld barrier limit failcnt 41548: kmemsize 4582652 5306699 12288832 13517715 21105036 lockedpages 0 0 600 600 0 privvmpages 38151 42676 229036 249036 0 shmpages 16274 16274 17237 17237 2 dummy 0 0 0 0 0 numproc 43 46 300 300 0 physpages 27260 29528 0 2147483647 0 vmguarpages 0 0 131072 2147483647 0 oomguarpages 27270 29538 131072 2147483647 0 numtcpsock 21 29 300 300 0 numflock 8 8 480 528 0 numpty 1 1 30 30 0 numsiginfo 0 1 1024 1024 0 tcpsndbuf 648440 675272 2867477 4096277 1711499 tcprcvbuf 301620 359716 2867477 4096277 0 othersockbuf 4472 4472 1433738 2662538 0 dgramrcvbuf 0 0 1433738 1433738 0 numothersock 12 12 300 300 0 dcachesize 0 0 2684271 2764800 0 numfile 3447 3496 6300 6300 3872 dummy 0 0 0 0 0 dummy 0 0 0 0 0 dummy 0 0 0 0 0 numiptent 14 14 200 200 0 TOP: (In January the load avg was really high 3-10, I was able to bring it down where it is currently is by giving APC more memory play around with) top - 16:46:07 up 2:13, 1 user, load average: 0.34, 0.20, 0.20 Tasks: 40 total, 2 running, 37 sleeping, 0 stopped, 1 zombie Cpu(s): 0.3% us, 0.1% sy, 0.0% ni, 99.7% id, 0.0% wa, 0.0% hi, 0.0% si Mem: 916144k total, 156668k used, 759476k free, 0k buffers Swap: 0k total, 0k used, 0k free, 0k cached MySQLTuner: (after optimizing every table and repairing any table with overage I got the fragmented count down to 86) [--] Data in MyISAM tables: 285M (Tables: 1105) [!!] Total fragmented tables: 86 [--] Up for: 2h 44m 38s (409K q [41.421 qps], 6K conn, TX: 1B, RX: 174M) [--] Reads / Writes: 79% / 21% [--] Total buffers: 58.0M global + 2.7M per thread (100 max threads) [!!] Query cache prunes per day: 675307 [!!] Temporary tables created on disk: 35% (7K on disk / 20K total)

    Read the article

  • Is the Cloud ready for an Enterprise Java web application? Seeking a JEE hosting advice.

    - by Jakub Holý
    Greetings to all the smart people around here! I'd like to ask whether it is feasible or a good idea at all to deploy a Java enterprise web application to a Cloud such as Amazon EC2. More exactly, I'm looking for infrastructure options for an application that shall handle few hundred users with long but neither CPU nor memory intensive sessions. I'm considering dedicated servers, virtual private servers (VPSs) and EC2. I've noticed that there is a project called JBoss Cloud so people are working on enabling such a deployment, on the other hand it doesn't seem to be mature yet and I'm not sure that the cloud is ready for this kind of applications, which differs from the typical cloud-based applications like Twitter. Would you recommend to deploy it to the cloud? What are the pros and cons? The application is a Java EE 5 web application whose main function is to enable users to compose their own customized Product by combining the available Parts. It uses stateless and stateful session beans and JPA for persistence of entities to a RDBMS and fetches information about Parts from the company's inventory system via a web service. Aside of external users it's used also by few internal ones, who are authenticated against the company's LDAP. The application should handle around 300-400 concurrent users building their product and should be reasonably scalable and available though these qualities are only of a medium importance at this stage. I've proposed an architecture consisting of a firewall (FW) and load balancer supporting sticky sessions and https (in the Cloud this would be replaced with EC2's Elastic Load Balancing service and FW on the app. servers, in a physical architecture the load-balancer would be a HW), then two physical clustered application servers combined with web servers (so that if one fails, a user doesn't loose his/her long built product) and finally a database server. The DB server would need a slave backup instance that can replace the master instance if it fails. This should provide reasonable availability and fault tolerance and provide good scalability as long as a single RDBMS can keep with the load, which should be OK for quite a while because most of the operations are done in the memory using a stateful bean and only occasionally stored or retrieved from the DB and the amount of data is low too. A problematic part could be the dependency on the remote inventory system webservice but with good caching of its outputs in the application it should be OK too. Unfortunately I've only vague idea of the system resources (memory size, number and speed of CPUs/cores) that such an "average Java EE application" for few hundred users needs. My rough and mostly unfounded estimate based on actual Amazon offerings is that 1.7GB and a single, 2-core "modern CPU" with speed around 2.5GHz (the High-CPU Medium Instance) should be sufficient for any of the two application servers (since we can handle higher load by provisioning more of them). Alternatively I would consider using the Large instance (64b, 7.5GB RAM, 2 cores at 1GHz) So my question is whether such a deployment to the cloud is technically and financially feasible or whether dedicated/VPS servers would be a better option and whether there are some real-world experiences with something similar. Thank you very much! /Jakub Holy PS: I've found the JBoss EAP in a Cloud Case Study that shows that it is possible to deploy a real-world Java EE application to the EC2 cloud but unfortunately there're no details regarding topology, instance types, or anything :-(

    Read the article

  • Magento hosting on a budget

    - by spa
    I have to do a setup for Magento. My constraint is primarily ease of setup and fault tolerance/fail over. Furthermore costs are an issue. I have three identical physical servers to get the job done. Each server node has an i7 quad core, 16GB RAM, and 2x3TB HD in a software RAID 1 configuration. Each node runs Ubuntu 12.04. right now. I have an additional IP address which can be routed to any of these nodes. The Magento shop has max. 1000 products, 50% of it are bundle products. I would estimate that max. 100 users are active at once. This leads me to the conclusion, that performance is not top priority here. My first setup idea One node (lb) runs nginx as a load balancer. The additional IP is used with domain name and routed to this node by default. Nginx distributes the load equally to the other two nodes (shop1, shop2). Shop1 and shop2 are configured equally: each server runs Apache2 and MySQL. The Mysqls are configured with master/slave replication. My failover strategy: Lb fails = Route IP to shop1 (MySQL master), continue. Shop1 fails = Lb will handle that automatically, promote MySQL slave on shop2 to master, reconfigure Magento to use shop2 for writes, continue. Shop2 fails = Lb will handle that automatically, continue. Is this a sane strategy? Has anyone done a similar setup with Magento? My second setup idea Another way to do it would be to use drbd for storing the MySQL data files on shop1 and shop2. I understand that in this scenario only one node/MySQL instance can be active and the other is used as hot standby. So in case shop1 fails, I would start up MySQL on shop2, route the IP to shop2, and continue. I like that as the MySQL setup is easier and the nodes can be configured 99% identical. So in this case the load balancer becomes useless and I have a spare server. My third setup idea The third way might be master-master replication of MySQL databases. However, in my optinion this might be tricky, as Magento isn't build for this scenario (e.g. conflicting ids for new rows). I would not do that until I have heard of a working example. Could you give me an advice which route to follow? There seems not one "good" way to do it. E.g. I read blog posts which describe a MySQL master/slave setup for Magento, but elsewhere I read, that data might get duplicated when the slave lags behind the master (e.g. when an order is placed, a customer might get created twice). I'm kind of lost here.

    Read the article

  • Windows 7 sometimes boots in VGA mode

    - by TuxRug
    I have an Asus G50VT-x5 laptop with nVidia GeForce9800M-GS graphics. Normally, Windows boots normally, but about 20% of the time (rough estimate), it will boot with the fallback VGA driver, maxing out at 800x600 with no Aero. I've checked the system logs and there is nothing indicating an error loading the nVidia driver. It even specifies in the logs that the Nvidia Display Driver service started successfully, even though it has booted in safe graphics mode. This has been happening for a while, but it's happening a little more often now than it was before. Since the first time my system exhibited this behavior, I have updated my graphics driver a handful of times. I used System Information for Windows to check for problems there, but the only thing that stood out was the following: Core Temperature 4486449 °C (8075639 °F) Shaders Temperature 1171513530 °C (2108724330 °F) I know this reading is incorrect, because my laptop is nowhere near the surface of the sun and my desk has not burst into flames. When it's opererating normally, I get a sane reading like [Core Temperature 58 °C (136 °F)] with no Shaders Temperature listed. All I have to do to resolve the issue is reboot. I have seen no stability issues with the graphics or anything else. A long time ago, I had an issue with this computer where my framerate would suddenly drop during a 3D game from 40fps to <1fps, but after looking at the temperature readout immediately after quitting a game, I removed the bottom panel and blew the dust out of the vent and heatsink. Since then I have no drops in framerate under any situation. I have uploaded a zip containing the SIW reports for when the problem is occurring and when the computer is operating normally. I don't have a paid account so it can only be downloaded 10 times, so please only download the reports if you think you can use them. If you try to download the reports and they are no longer available, please comment and I will re-upload them. If you want to look at the files, they are on Rapidshare. EDIT It happened again, and I looked a little deeper into the System logs. When this happens, there are a lot of errors about other device drivers unable to start. All of these errors are for PnP drivers. Also, my USB keyboard and mouse take a few moments before they actually start working, although this happens sometimes the first normal boot as well. I am quite sure this is related, so I am adding the pnp tag. Also, CHKDSK will not run on boot. Even if a check is scheduled or a volume is manually set as dirty, CHKDSK will be skipped entirely, not even leaving an entry in the System logs. I tried running CHKNTFS /D, which did not work. I then manually changed my HKLM\System\CurrentControlSet\Control\Session Manager BootExecute value to the default listed on Microsoft's website. That did not work either. I ended up booting to repair mode and running CHKDSK there, which found a number of minor inconsistencies on my system drive, but none on my data drive. I have no idea if this is related. Some more information for those who don't download my SIW report file: Antivirus and Firewall are ESET Smart Security I have three different virutalization programs installed: VMware Player, Windows Virtual PC, and VirtualBox. The network adapters for these show up in the log of failed device starts. EDIT 2 I tried running sfc /scannow, which reported that it found corrupted files that could not be fixed. The CBS log is extremely cryptic. I tried booting to my install disk, launching repair mode, and doing an offline sfc from there, which produced the same result.

    Read the article

  • How can I get penetration depth from Minkowski Portal Refinement / Xenocollide?

    - by Raven Dreamer
    I recently got an implementation of Minkowski Portal Refinement (MPR) successfully detecting collision. Even better, my implementation returns a good estimate (local minimum) direction for the minimum penetration depth. So I took a stab at adjusting the algorithm to return the penetration depth in an arbitrary direction, and was modestly successful - my altered method works splendidly for face-edge collision resolution! What it doesn't currently do, is correctly provide the minimum penetration depth for edge-edge scenarios, such as the case on the right: What I perceive to be happening, is that my current method returns the minimum penetration depth to the nearest vertex - which works fine when the collision is actually occurring on the plane of that vertex, but not when the collision happens along an edge. Is there a way I can alter my method to return the penetration depth to the point of collision, rather than the nearest vertex? Here's the method that's supposed to return the minimum penetration distance along a specific direction: public static Vector3 CalcMinDistance(List<Vector3> shape1, List<Vector3> shape2, Vector3 dir) { //holding variables Vector3 n = Vector3.zero; Vector3 swap = Vector3.zero; // v0 = center of Minkowski sum v0 = Vector3.zero; // Avoid case where centers overlap -- any direction is fine in this case //if (v0 == Vector3.zero) return Vector3.zero; //always pass in a valid direction. // v1 = support in direction of origin n = -dir; //get the differnce of the minkowski sum Vector3 v11 = GetSupport(shape1, -n); Vector3 v12 = GetSupport(shape2, n); v1 = v12 - v11; //if the support point is not in the direction of the origin if (v1.Dot(n) <= 0) { //Debug.Log("Could find no points this direction"); return Vector3.zero; } // v2 - support perpendicular to v1,v0 n = v1.Cross(v0); if (n == Vector3.zero) { //v1 and v0 are parallel, which means //the direction leads directly to an endpoint n = v1 - v0; //shortest distance is just n //Debug.Log("2 point return"); return n; } //get the new support point Vector3 v21 = GetSupport(shape1, -n); Vector3 v22 = GetSupport(shape2, n); v2 = v22 - v21; if (v2.Dot(n) <= 0) { //can't reach the origin in this direction, ergo, no collision //Debug.Log("Could not reach edge?"); return Vector2.zero; } // Determine whether origin is on + or - side of plane (v1,v0,v2) //tests linesegments v0v1 and v0v2 n = (v1 - v0).Cross(v2 - v0); float dist = n.Dot(v0); // If the origin is on the - side of the plane, reverse the direction of the plane if (dist > 0) { //swap the winding order of v1 and v2 swap = v1; v1 = v2; v2 = swap; //swap the winding order of v11 and v12 swap = v12; v12 = v11; v11 = swap; //swap the winding order of v11 and v12 swap = v22; v22 = v21; v21 = swap; //and swap the plane normal n = -n; } /// // Phase One: Identify a portal while (true) { // Obtain the support point in a direction perpendicular to the existing plane // Note: This point is guaranteed to lie off the plane Vector3 v31 = GetSupport(shape1, -n); Vector3 v32 = GetSupport(shape2, n); v3 = v32 - v31; if (v3.Dot(n) <= 0) { //can't enclose the origin within our tetrahedron //Debug.Log("Could not reach edge after portal?"); return Vector3.zero; } // If origin is outside (v1,v0,v3), then eliminate v2 and loop if (v1.Cross(v3).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v2 = v3; v21 = v31; v22 = v32; n = (v1 - v0).Cross(v3 - v0); continue; } // If origin is outside (v3,v0,v2), then eliminate v1 and loop if (v3.Cross(v2).Dot(v0) < 0) { //failed to enclose the origin, adjust points; v1 = v3; v11 = v31; v12 = v32; n = (v3 - v0).Cross(v2 - v0); continue; } bool hit = false; /// // Phase Two: Refine the portal int phase2 = 0; // We are now inside of a wedge... while (phase2 < 20) { phase2++; // Compute normal of the wedge face n = (v2 - v1).Cross(v3 - v1); n.Normalize(); // Compute distance from origin to wedge face float d = n.Dot(v1); // If the origin is inside the wedge, we have a hit if (d > 0 ) { //Debug.Log("Do plane test here"); float T = n.Dot(v2) / n.Dot(dir); Vector3 pointInPlane = (dir * T); return pointInPlane; } // Find the support point in the direction of the wedge face Vector3 v41 = GetSupport(shape1, -n); Vector3 v42 = GetSupport(shape2, n); v4 = v42 - v41; float delta = (v4 - v3).Dot(n); float separation = -(v4.Dot(n)); if (delta <= kCollideEpsilon || separation >= 0) { //Debug.Log("Non-convergance detected"); //Debug.Log("Do plane test here"); return Vector3.zero; } // Compute the tetrahedron dividing face (v4,v0,v1) float d1 = v4.Cross(v1).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v2) float d2 = v4.Cross(v2).Dot(v0); // Compute the tetrahedron dividing face (v4,v0,v3) float d3 = v4.Cross(v3).Dot(v0); if (d1 < 0) { if (d2 < 0) { // Inside d1 & inside d2 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } else { // Inside d1 & outside d2 ==> eliminate v3 v3 = v4; v31 = v41; v32 = v42; } } else { if (d3 < 0) { // Outside d1 & inside d3 ==> eliminate v2 v2 = v4; v21 = v41; v22 = v42; } else { // Outside d1 & outside d3 ==> eliminate v1 v1 = v4; v11 = v41; v12 = v42; } } } return Vector3.zero; } }

    Read the article

  • Session memory – who’s this guy named Max and what’s he doing with my memory?

    - by extended_events
    SQL Server MVP Jonathan Kehayias (blog) emailed me a question last week when he noticed that the total memory used by the buffers for an event session was larger than the value he specified for the MAX_MEMORY option in the CREATE EVENT SESSION DDL. The answer here seems like an excellent subject for me to kick-off my new “401 – Internals” tag that identifies posts where I pull back the curtains a bit and let you peek into what’s going on inside the extended events engine. In a previous post (Option Trading: Getting the most out of the event session options) I explained that we use a set of buffers to store the event data before  we write the event data to asynchronous targets. The MAX_MEMORY along with the MEMORY_PARTITION_MODE defines how big each buffer will be. Theoretically, that means that I can predict the size of each buffer using the following formula: max memory / # of buffers = buffer size If it was that simple I wouldn’t be writing this post. I’ll take “boundary” for 64K Alex For a number of reasons that are beyond the scope of this blog, we create event buffers in 64K chunks. The result of this is that the buffer size indicated by the formula above is rounded up to the next 64K boundary and that is the size used to create the buffers. If you think visually, this means that the graph of your max_memory option compared to the actual buffer size that results will look like a set of stairs rather than a smooth line. You can see this behavior by looking at the output of dm_xe_sessions, specifically the fields related to the buffer sizes, over a range of different memory inputs: Note: This test was run on a 2 core machine using per_cpu partitioning which results in 5 buffers. (Seem my previous post referenced above for the math behind buffer count.) input_memory_kb total_regular_buffers regular_buffer_size total_buffer_size 637 5 130867 654335 638 5 130867 654335 639 5 130867 654335 640 5 196403 982015 641 5 196403 982015 642 5 196403 982015 This is just a segment of the results that shows one of the “jumps” between the buffer boundary at 639 KB and 640 KB. You can verify the size boundary by doing the math on the regular_buffer_size field, which is returned in bytes: 196403 – 130867 = 65536 bytes 65536 / 1024 = 64 KB The relationship between the input for max_memory and when the regular_buffer_size is going to jump from one 64K boundary to the next is going to change based on the number of buffers being created. The number of buffers is dependent on the partition mode you choose. If you choose any partition mode other than NONE, the number of buffers will depend on your hardware configuration. (Again, see the earlier post referenced above.) With the default partition mode of none, you always get three buffers, regardless of machine configuration, so I generated a “range table” for max_memory settings between 1 KB and 4096 KB as an example. start_memory_range_kb end_memory_range_kb total_regular_buffers regular_buffer_size total_buffer_size 1 191 NULL NULL NULL 192 383 3 130867 392601 384 575 3 196403 589209 576 767 3 261939 785817 768 959 3 327475 982425 960 1151 3 393011 1179033 1152 1343 3 458547 1375641 1344 1535 3 524083 1572249 1536 1727 3 589619 1768857 1728 1919 3 655155 1965465 1920 2111 3 720691 2162073 2112 2303 3 786227 2358681 2304 2495 3 851763 2555289 2496 2687 3 917299 2751897 2688 2879 3 982835 2948505 2880 3071 3 1048371 3145113 3072 3263 3 1113907 3341721 3264 3455 3 1179443 3538329 3456 3647 3 1244979 3734937 3648 3839 3 1310515 3931545 3840 4031 3 1376051 4128153 4032 4096 3 1441587 4324761 As you can see, there are 21 “steps” within this range and max_memory values below 192 KB fall below the 64K per buffer limit so they generate an error when you attempt to specify them. Max approximates True as memory approaches 64K The upshot of this is that the max_memory option does not imply a contract for the maximum memory that will be used for the session buffers (Those of you who read Take it to the Max (and beyond) know that max_memory is really only referring to the event session buffer memory.) but is more of an estimate of total buffer size to the nearest higher multiple of 64K times the number of buffers you have. The maximum delta between your initial max_memory setting and the true total buffer size occurs right after you break through a 64K boundary, for example if you set max_memory = 576 KB (see the green line in the table), your actual buffer size will be closer to 767 KB in a non-partitioned event session. You get “stepped up” for every 191 KB block of initial max_memory which isn’t likely to cause a problem for most machines. Things get more interesting when you consider a partitioned event session on a computer that has a large number of logical CPUs or NUMA nodes. Since each buffer gets “stepped up” when you break a boundary, the delta can get much larger because it’s multiplied by the number of buffers. For example, a machine with 64 logical CPUs will have 160 buffers using per_cpu partitioning or if you have 8 NUMA nodes configured on that machine you would have 24 buffers when using per_node. If you’ve just broken through a 64K boundary and get “stepped up” to the next buffer size you’ll end up with total buffer size approximately 10240 KB and 1536 KB respectively (64K * # of buffers) larger than max_memory value you might think you’re getting. Using per_cpu partitioning on large machine has the most impact because of the large number of buffers created. If the amount of memory being used by your system within these ranges is important to you then this is something worth paying attention to and considering when you configure your event sessions. The DMV dm_xe_sessions is the tool to use to identify the exact buffer size for your sessions. In addition to the regular buffers (read: event session buffers) you’ll also see the details for large buffers if you have configured MAX_EVENT_SIZE. The “buffer steps” for any given hardware configuration should be static within each partition mode so if you want to have a handy reference available when you configure your event sessions you can use the following code to generate a range table similar to the one above that is applicable for your specific machine and chosen partition mode. DECLARE @buf_size_output table (input_memory_kb bigint, total_regular_buffers bigint, regular_buffer_size bigint, total_buffer_size bigint) DECLARE @buf_size int, @part_mode varchar(8) SET @buf_size = 1 -- Set to the begining of your max_memory range (KB) SET @part_mode = 'per_cpu' -- Set to the partition mode for the table you want to generate WHILE @buf_size <= 4096 -- Set to the end of your max_memory range (KB) BEGIN     BEGIN TRY         IF EXISTS (SELECT * from sys.server_event_sessions WHERE name = 'buffer_size_test')             DROP EVENT SESSION buffer_size_test ON SERVER         DECLARE @session nvarchar(max)         SET @session = 'create event session buffer_size_test on server                         add event sql_statement_completed                         add target ring_buffer                         with (max_memory = ' + CAST(@buf_size as nvarchar(4)) + ' KB, memory_partition_mode = ' + @part_mode + ')'         EXEC sp_executesql @session         SET @session = 'alter event session buffer_size_test on server                         state = start'         EXEC sp_executesql @session         INSERT @buf_size_output (input_memory_kb, total_regular_buffers, regular_buffer_size, total_buffer_size)             SELECT @buf_size, total_regular_buffers, regular_buffer_size, total_buffer_size FROM sys.dm_xe_sessions WHERE name = 'buffer_size_test'     END TRY     BEGIN CATCH         INSERT @buf_size_output (input_memory_kb)             SELECT @buf_size     END CATCH     SET @buf_size = @buf_size + 1 END DROP EVENT SESSION buffer_size_test ON SERVER SELECT MIN(input_memory_kb) start_memory_range_kb, MAX(input_memory_kb) end_memory_range_kb, total_regular_buffers, regular_buffer_size, total_buffer_size from @buf_size_output group by total_regular_buffers, regular_buffer_size, total_buffer_size Thanks to Jonathan for an interesting question and a chance to explore some of the details of Extended Event internals. - Mike

    Read the article

  • What does it mean when a User-Agent has another User-Agent inside it?

    - by Erx_VB.NExT.Coder
    Basically, sometimes the user-agent will have its normal user-agent displayed, then at the end it will have teh "User-Agent: " tag displayed, and right after it another user-agent is shown. Sometimes, the second user-agent is just appended to the first one without the "User-Agent: " tag. Here are some samples I've seen: The first few contain the "User-Agent: " tag in the middle somewhere, and I've changed its font to make it easier to to see. Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0; GTB6; User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506) Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6; MRA 5.10 (build 5339); User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR 1.1.4322; .NET CLR 2.0.50727) Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152) Here are some without the "User-Agent: " tag in the middle, but just two user agents that seem stiched together. Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); .NET CLR 3.5.30729) Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6; IPMS/6568080A-04A5AD839A9; TCO_20090713170733; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1); InfoPath.2) Now, just to add a few notes to this. I understand that the "User-Agent: " tag is normally a header, and what follows a typical "User-Agent: " string sequence is the actual user agent that is sent to servers etc, but normally the "User-Agent: " string should not be part of the actual user agent, that is more like the pre-fix or a tag indicating that what follows will be the actual user agent. Additionally, I may have thought, hey, these are just two user agents pasted together, but on closer inspection, you realize that they are not. On all of these dual user agent listings, if you look at the opening bracket "(" just before the "compatible" keyword, you realize the pair to that bracket ")" is actually at the very end, the end of the second user agent. So, the first user agents closing bracket ")" never occurs before the second user agent begins, it's always right at the end, and therefore, the second user agent is more like one of the features of the first user agent, like: "Trident/4.0" or "GTB6" etc etc... The other thing to note that the second user agent is always MSIE 6.0 (Internet Explorer 6.0), interesting. What I had initially thought was it's some sort of Virtual Machine displaying the browser in use & the browser that is installed, but then I thought, what'd be the point in that? Finally, right now, I am thinking, it's probably soem sort of "Compatibility View" type thing, where even if MSIE 7.0 or 8.0 is installed, when my hypothetical the "Display In Internet Explorer 6.0" mode is turned on, the user agent changes to something like this. That being, IE 8.0 is installed, but is rendering everything as IE 6.0 would. Is there or was there such a feature in Internet Explorer? Am I on to something here? What are your thoughts on this? If you have any other ideas, please feel free to let us know. At the moment, I'm just trying to understand if these are valid User Agents, or if they are invalid. In a list of about 44,000 User Agents, I've seen this type of Dual User Agent about 400 times. I've closely inspected 40 of them, and every single one had MSIE 6.0 as the "second" user agent (and the first user agent a higher version of MSIE, such as 7 or 8). This was true for all except one, where both user agents were MSIE 8.0, here it is: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; Mozilla/4.0 (compatible; MSIE 8.0; Win32; GMX); GTB0.0) This occured once in my 40 "close" inspections. I've estimated the 400 in 44,000 by taking a sample of the first 4,400 user agents, and finding 40 of these in the MSIE/Windows user agents, and extrapolated that to estimate 40. There were also similar things occuring for non MSIE user agents where there were two Mozilla's in one user agent, the non MSIE ones would probably add another 30% on top of the ones I've noted. I can show you samples of them if anyone would like. There we have it, this is where I'm at, what do you guys think?

    Read the article

  • General monitoring for SQL Server Analysis Services using Performance Monitor

    - by Testas
    A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs.     File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor,  a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used……  Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total  The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above  25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file 

    Read the article

  • Movement prediction for non-shooters

    - by ShadowChaser
    I'm working on an isometric 2D game with moderate-scale multiplayer, approximately 20-30 players connected at once to a persistent server. I've had some difficulty getting a good movement prediction implementation in place. Physics/Movement The game doesn't have a true physics implementation, but uses the basic principles to implement movement. Rather than continually polling input, state changes (ie/ mouse down/up/move events) are used to change the state of the character entity the player is controlling. The player's direction (ie/ north-east) is combined with a constant speed and turned into a true 3D vector - the entity's velocity. In the main game loop, "Update" is called before "Draw". The update logic triggers a "physics update task" that tracks all entities with a non-zero velocity uses very basic integration to change the entities position. For example: entity.Position += entity.Velocity.Scale(ElapsedTime.Seconds) (where "Seconds" is a floating point value, but the same approach would work for millisecond integer values). The key point is that no interpolation is used for movement - the rudimentary physics engine has no concept of a "previous state" or "current state", only a position and velocity. State Change and Update Packets When the velocity of the character entity the player is controlling changes, a "move avatar" packet is sent to the server containing the entity's action type (stand, walk, run), direction (north-east), and current position. This is different from how 3D first person games work. In a 3D game the velocity (direction) can change frame to frame as the player moves around. Sending every state change would effectively transmit a packet per frame, which would be too expensive. Instead, 3D games seem to ignore state changes and send "state update" packets on a fixed interval - say, every 80-150ms. Since speed and direction updates occur much less frequently in my game, I can get away with sending every state change. Although all of the physics simulations occur at the same speed and are deterministic, latency is still an issue. For that reason, I send out routine position update packets (similar to a 3D game) but much less frequently - right now every 250ms, but I suspect with good prediction I can easily boost it towards 500ms. The biggest problem is that I've now deviated from the norm - all other documentation, guides, and samples online send routine updates and interpolate between the two states. It seems incompatible with my architecture, and I need to come up with a better movement prediction algorithm that is closer to a (very basic) "networked physics" architecture. The server then receives the packet and determines the players speed from it's movement type based on a script (Is the player able to run? Get the player's running speed). Once it has the speed, it combines it with the direction to get a vector - the entity's velocity. Some cheat detection and basic validation occurs, and the entity on the server side is updated with the current velocity, direction, and position. Basic throttling is also performed to prevent players from flooding the server with movement requests. After updating its own entity, the server broadcasts an "avatar position update" packet to all other players within range. The position update packet is used to update the client side physics simulations (world state) of the remote clients and perform prediction and lag compensation. Prediction and Lag Compensation As mentioned above, clients are authoritative for their own position. Except in cases of cheating or anomalies, the client's avatar will never be repositioned by the server. No extrapolation ("move now and correct later") is required for the client's avatar - what the player sees is correct. However, some sort of extrapolation or interpolation is required for all remote entities that are moving. Some sort of prediction and/or lag-compensation is clearly required within the client's local simulation / physics engine. Problems I've been struggling with various algorithms, and have a number of questions and problems: Should I be extrapolating, interpolating, or both? My "gut feeling" is that I should be using pure extrapolation based on velocity. State change is received by the client, client computes a "predicted" velocity that compensates for lag, and the regular physics system does the rest. However, it feels at odds to all other sample code and articles - they all seem to store a number of states and perform interpolation without a physics engine. When a packet arrives, I've tried interpolating the packet's position with the packet's velocity over a fixed time period (say, 200ms). I then take the difference between the interpolated position and the current "error" position to compute a new vector and place that on the entity instead of the velocity that was sent. However, the assumption is that another packet will arrive in that time interval, and it's incredibly difficult to "guess" when the next packet will arrive - especially since they don't all arrive on fixed intervals (ie/ state changes as well). Is the concept fundamentally flawed, or is it correct but needs some fixes / adjustments? What happens when a remote player stops? I can immediately stop the entity, but it will be positioned in the "wrong" spot until it moves again. If I estimate a vector or try to interpolate, I have an issue because I don't store the previous state - the physics engine has no way to say "you need to stop after you reach position X". It simply understands a velocity, nothing more complex. I'm reluctant to add the "packet movement state" information to the entities or physics engine, since it violates basic design principles and bleeds network code across the rest of the game engine. What should happen when entities collide? There are three scenarios - the controlling player collides locally, two entities collide on the server during a position update, or a remote entity update collides on the local client. In all cases I'm uncertain how to handle the collision - aside from cheating, both states are "correct" but at different time periods. In the case of a remote entity it doesn't make sense to draw it walking through a wall, so I perform collision detection on the local client and cause it to "stop". Based on point #2 above, I might compute a "corrected vector" that continually tries to move the entity "through the wall" which will never succeed - the remote avatar is stuck there until the error gets too high and it "snaps" into position. How do games work around this?

    Read the article

  • SQL University: What and why of database testing

    - by Mladen Prajdic
    This is a post for a great idea called SQL University started by Jorge Segarra also famously known as SqlChicken on Twitter. It’s a collection of blog posts on different database related topics contributed by several smart people all over the world. So this week is mine and we’ll be talking about database testing and refactoring. In 3 posts we’ll cover: SQLU part 1 - What and why of database testing SQLU part 2 - What and why of database refactoring SQLU part 2 – Tools of the trade With that out of the way let us sharpen our pencils and get going. Why test a database The sad state of the industry today is that there is very little emphasis on testing in general. Test driven development is still a small niche of the programming world while refactoring is even smaller. The cause of this is the inability of developers to convince themselves and their managers that writing tests is beneficial. At the moment they are mostly viewed as waste of time. This is because the average person (let’s not fool ourselves, we’re all average) is unable to think about lower future costs in relation to little more current work. It’s orders of magnitude easier to know about the current costs in relation to current amount of work. That’s why programmers convince themselves testing is a waste of time. However we have to ask ourselves what tests are really about? Maybe finding bugs? No, not really. If we introduce bugs, we’re likely to write test around those bugs too. But yes we can find some bugs with tests. The main point of tests is to have reproducible repeatability in our systems. By having a code base largely covered by tests we can know with better certainty what a small code change can break in other parts of the system. By having repeatability we can make code changes with confidence, since we know we’ll see what breaks in other tests. And here comes the inability to estimate future costs. By spending just a few more hours writing those tests we’d know instantly what broke where. Imagine we fix a reported bug. We check-in the code, deploy it and the users are happy. Until we get a call 2 weeks later about a certain monthly process has stopped working. What we don’t know is that this process was developed by a long gone coworker and for some reason it relied on that same bug we’ve happily fixed. There’s no way we could’ve known that. We say OK and go in and fix the monthly process. But what we have no clue about is that there’s this ETL job that relied on data from that monthly process. Now that we’ve fixed the process it’s giving unexpected (yet correct since we fixed it) data to the ETL job. So we have to fix that too. But there’s this part of the app we coded that relies on data from that exact ETL job. And just like that we enter the “Loop of maintenance horror”. With the loop eventually comes blame. Here’s a nice tip for all developers and DBAs out there: If you make a mistake man up and admit to it. All of the above is valid for any kind of software development. Keeping this in mind the database is nothing other than just a part of the application. But a big part! One reason why testing a database is even more important than testing an application is that one database is usually accessed from multiple applications and processes. This makes it the central and vital part of the enterprise software infrastructure. Knowing all this can we really afford not to have tests? What to test in a database Now that we’ve decided we’ll dive into this testing thing we have to ask ourselves what needs to be tested? The short answer is: everything. The long answer is: read on! There are 2 main ways of doing tests: Black box and White box testing. Black box testing means we have no idea how the system internals are built and we only have access to it’s inputs and outputs. With it we test that the internal changes to the system haven’t caused the input/output behavior of the system to change. The most important thing to test here are the edge conditions. It’s where most programs break. Having good edge condition tests we can be more confident that the systems changes won’t break. White box testing has the full knowledge of the system internals. With it we test the internal system changes, different states of the application, etc… White and Black box tests should be complementary to each other as they are very much interconnected. Testing database routines includes testing stored procedures, views, user defined functions and anything you use to access the data with. Database routines are your input/output interface to the database system. They count as black box testing. We test then for 2 things: Data and schema. When testing schema we only care about the columns and the data types they’re returning. After all the schema is the contract to the out side systems. If it changes we usually have to change the applications accessing it. One helpful T-SQL command when doing schema tests is SET FMTONLY ON. It tells the SQL Server to return only empty results sets. This speeds up tests because it doesn’t return any data to the client. After we’ve validated the schema we have to test the returned data. There no other way to do this but to have expected data known before the tests executes and comparing that data to the database routine output. Testing Authentication and Authorization helps us validate who has access to the SQL Server box (Authentication) and who has access to certain database objects (Authorization). For desktop applications and windows authentication this works well. But the biggest problem here are web apps. They usually connect to the database as a single user. Please ensure that that user is not SA or an account with admin privileges. That is just bad. Load testing ensures us that our database can handle peak loads. One often overlooked tool for load testing is Microsoft’s OSTRESS tool. It’s part of RML utilities (x86, x64) for SQL Server and can help determine if our database server can handle loads like 100 simultaneous users each doing 10 requests per second. SQL Profiler can also help us here by looking at why certain queries are slow and what to do to fix them.   One particular problem to think about is how to begin testing existing databases. First thing we have to do is to get to know those databases. We can’t test something when we don’t know how it works. To do this we have to talk to the users of the applications accessing the database, run SQL Profiler to see what queries are being run, use existing documentation to decipher all the object relationships, etc… The way to approach this is to choose one part of the database (say a logical grouping of tables that go together) and filter our traces accordingly. Once we’ve done that we move on to the next grouping and so on until we’ve covered the whole database. Then we move on to the next one. Database Testing is a topic that we can spent many hours discussing but let this be a nice intro to the world of database testing. See you in the next post.

    Read the article

  • Refactoring Part 1 : Intuitive Investments

    - by Wes McClure
    Fear, it’s what turns maintaining applications into a nightmare.  Technology moves on, teams move on, someone is left to operate the application, what was green is now perceived brown.  Eventually the business will evolve and changes will need to be made.  The approach to those changes often dictates the long term viability of the application.  Fear of change, lack of passion and a lack of interest in understanding the domain often leads to a paranoia to do anything that doesn’t involve duct tape and bailing twine.  Don’t get me wrong, those have a place in the short term viability of a project but they don’t have a place in the long term.  Add to it “us versus them” in regards to the original team and those that maintain it, internal politics and other factors and you have a recipe for disaster.  This results in code that quickly becomes unmanageable.  Even the most clever of designs will eventually become sub optimal and debt will amount that exponentially makes changes difficult.  This is where refactoring comes in, and it’s something I’m very passionate about.  Refactoring is about improving the process whereby we make change, it’s an exponential investment in the process of change. Without it we will incur exponential complexity that halts productivity. Investments, especially in the long term, require intuition and reflection.  How can we tackle new development effectively via evolving the original design and paying off debt that has been incurred? The longer we wait to ask and answer this question, the more it will cost us.  Small requests don’t warrant big changes, but realizing when changes now will pay off in the long term, and especially in the short term, is valuable. I have done my fair share of maintaining applications and continuously refactoring as needed, but recently I’ve begun work on a project that hasn’t had much debt, if any, paid down in years.  This is the first in a series of blog posts to try to capture the process which is largely driven by intuition of smaller refactorings from other projects. Signs that refactoring could help: Testability How can decreasing test time not pay dividends? One of the first things I found was that a very important piece often takes 30+ minutes to test.  I can only imagine how much time this has cost historically, but more importantly the time it might cost in the coming weeks: I estimate at least 10-20 hours per person!  This is simply unacceptable for almost any situation.  As it turns out, about 6 hours of working with this part of the application and I was able to cut the time down to under 30 seconds!  In less than the lost time of one week, I was able to fix the problem for all future weeks! If we can’t test fast then we can’t change fast, nor with confidence. Code is used by end users and it’s also used by developers, consider your own needs in terms of the code base.  Adding logic to enable/disable features during testing can help decouple parts of an application and lead to massive improvements.  What exactly is so wrong about test code in real code?  Often, these become features for operators and sometimes end users.  If you cannot run an integration test within a test runner in your IDE, it’s time to refactor. Readability Are variables named meaningfully via a ubiquitous language? Is the code segmented functionally or behaviorally so as to minimize the complexity of any one area? Are aspects properly segmented to avoid confusion (security, logging, transactions, translations, dependency management etc) Is the code declarative (what) or imperative (how)?  What matters, not how.  LINQ is a great abstraction of the what, not how, of collection manipulation.  The Reactive framework is a great example of the what, not how, of managing streams of data. Are constants abstracted and named, or are they just inline? Do people constantly bitch about the code/design? If the code is hard to understand, it will be hard to change with confidence.  It’s a large undertaking if the original designers didn’t pay much attention to readability and as such will never be done to “completion.”  Make sure not to go over board, instead use this as you change an application, not in lieu of changes (like with testability). Complexity Simplicity will never be achieved, it’s highly subjective.  That said, a lot of code can be significantly simplified, tidy it up as you go.  Refactoring will often converge upon a simplification step after enough time, keep an eye out for this. Understandability In the process of changing code, one often gains a better understanding of it.  Refactoring code is a good way to learn how it works.  However, it’s usually best in combination with other reasons, in effect killing two birds with one stone.  Often this is done when readability is poor, in which case understandability is usually poor as well.  In the large undertaking we are making with this legacy application, we will be replacing it.  Therefore, understanding all of its features is important and this refactoring technique will come in very handy. Unused code How can deleting things not help? This is a freebie in refactoring, it’s very easy to detect with modern tools, especially in statically typed languages.  We have VCS for a reason, if in doubt, delete it out (ok that was cheesy)! If you don’t know where to start when refactoring, this is an excellent starting point! Duplication Do not pray and sacrifice to the anti-duplication gods, there are excellent examples where consolidated code is a horrible idea, usually with divergent domains.  That said, mediocre developers live by copy/paste.  Other times features converge and aren’t combined.  Tools for finding similar code are great in the example of copy/paste problems.  Knowledge of the domain helps identify convergent concepts that often lead to convergent solutions and will give intuition for where to look for conceptual repetition. 80/20 and the Boy Scouts It’s often said that 80% of the time 20% of the application is used most.  These tend to be the parts that are changed.  There are also parts of the code where 80% of the time is spent changing 20% (probably for all the refactoring smells above).  I focus on these areas any time I make a change and follow the philosophy of the Boy Scout in cleaning up more than I messed up.  If I spend 2 hours changing an application, in the 20%, I’ll always spend at least 15 minutes cleaning it or nearby areas. This gives a huge productivity edge on developers that don’t. Ironically after a short period of time the 20% shrinks enough that we don’t have to spend 80% of our time there and can move on to other areas.   Refactoring is highly subjective, never attempt to refactor to completion!  Learn to be comfortable with leaving one part of the application in a better state than others.  It’s an evolution, not a revolution.  These are some simple areas to look into when making changes and can help get one started in the process.  I’ve often found that refactoring is a convergent process towards simplicity that sometimes spans a few hours but often can lead to massive simplifications over the timespan of weeks and months of regular development.

    Read the article

  • When is a Seek not a Seek?

    - by Paul White
    The following script creates a single-column clustered table containing the integers from 1 to 1,000 inclusive. IF OBJECT_ID(N'tempdb..#Test', N'U') IS NOT NULL DROP TABLE #Test ; GO CREATE TABLE #Test ( id INTEGER PRIMARY KEY CLUSTERED ); ; INSERT #Test (id) SELECT V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 1000 ; Let’s say we need to find the rows with values from 100 to 170, excluding any values that divide exactly by 10.  One way to write that query would be: SELECT T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; That query produces a pretty efficient-looking query plan: Knowing that the source column is defined as an INTEGER, we could also express the query this way: SELECT T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; We get a similar-looking plan: If you look closely, you might notice that the line connecting the two icons is a little thinner than before.  The first query is estimated to produce 61.9167 rows – very close to the 63 rows we know the query will return.  The second query presents a tougher challenge for SQL Server because it doesn’t know how to predict the selectivity of the modulo expression (T.id % 10 > 0).  Without that last line, the second query is estimated to produce 68.1667 rows – a slight overestimate.  Adding the opaque modulo expression results in SQL Server guessing at the selectivity.  As you may know, the selectivity guess for a greater-than operation is 30%, so the final estimate is 30% of 68.1667, which comes to 20.45 rows. The second difference is that the Clustered Index Seek is costed at 99% of the estimated total for the statement.  For some reason, the final SELECT operator is assigned a small cost of 0.0000484 units; I have absolutely no idea why this is so, or what it models.  Nevertheless, we can compare the total cost for both queries: the first one comes in at 0.0033501 units, and the second at 0.0034054.  The important point is that the second query is costed very slightly higher than the first, even though it is expected to produce many fewer rows (20.45 versus 61.9167). If you run the two queries, they produce exactly the same results, and both complete so quickly that it is impossible to measure CPU usage for a single execution.  We can, however, compare the I/O statistics for a single run by running the queries with STATISTICS IO ON: Table '#Test'. Scan count 63, logical reads 126, physical reads 0. Table '#Test'. Scan count 01, logical reads 002, physical reads 0. The query with the IN list uses 126 logical reads (and has a ‘scan count’ of 63), while the second query form completes with just 2 logical reads (and a ‘scan count’ of 1).  It is no coincidence that 126 = 63 * 2, by the way.  It is almost as if the first query is doing 63 seeks, compared to one for the second query. In fact, that is exactly what it is doing.  There is no indication of this in the graphical plan, or the tool-tip that appears when you hover your mouse over the Clustered Index Seek icon.  To see the 63 seek operations, you have click on the Seek icon and look in the Properties window (press F4, or right-click and choose from the menu): The Seek Predicates list shows a total of 63 seek operations – one for each of the values from the IN list contained in the first query.  I have expanded the first seek node to show the details; it is seeking down the clustered index to find the entry with the value 101.  Each of the other 62 nodes expands similarly, and the same information is contained (even more verbosely) in the XML form of the plan. Each of the 63 seek operations starts at the root of the clustered index B-tree and navigates down to the leaf page that contains the sought key value.  Our table is just large enough to need a separate root page, so each seek incurs 2 logical reads (one for the root, and one for the leaf).  We can see the index depth using the INDEXPROPERTY function, or by using the a DMV: SELECT S.index_type_desc, S.index_depth FROM sys.dm_db_index_physical_stats ( DB_ID(N'tempdb'), OBJECT_ID(N'tempdb..#Test', N'U'), 1, 1, DEFAULT ) AS S ; Let’s look now at the Properties window when the Clustered Index Seek from the second query is selected: There is just one seek operation, which starts at the root of the index and navigates the B-tree looking for the first key that matches the Start range condition (id >= 101).  It then continues to read records at the leaf level of the index (following links between leaf-level pages if necessary) until it finds a row that does not meet the End range condition (id <= 169).  Every row that meets the seek range condition is also tested against the Residual Predicate highlighted above (id % 10 > 0), and is only returned if it matches that as well. You will not be surprised that the single seek (with a range scan and residual predicate) is much more efficient than 63 singleton seeks.  It is not 63 times more efficient (as the logical reads comparison would suggest), but it is around three times faster.  Let’s run both query forms 10,000 times and measure the elapsed time: DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON; SET STATISTICS XML OFF; ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id IN ( 101,102,103,104,105,106,107,108,109, 111,112,113,114,115,116,117,118,119, 121,122,123,124,125,126,127,128,129, 131,132,133,134,135,136,137,138,139, 141,142,143,144,145,146,147,148,149, 151,152,153,154,155,156,157,158,159, 161,162,163,164,165,166,167,168,169 ) ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; GO DECLARE @i INTEGER, @n INTEGER = 10000, @s DATETIME = GETDATE() ; SET NOCOUNT ON ; WHILE @n > 0 BEGIN SELECT @i = T.id FROM #Test AS T WHERE T.id >= 101 AND T.id <= 169 AND T.id % 10 > 0 ; SET @n -= 1; END ; PRINT DATEDIFF(MILLISECOND, @s, GETDATE()) ; On my laptop, running SQL Server 2008 build 4272 (SP2 CU2), the IN form of the query takes around 830ms and the range query about 300ms.  The main point of this post is not performance, however – it is meant as an introduction to the next few parts in this mini-series that will continue to explore scans and seeks in detail. When is a seek not a seek?  When it is 63 seeks © Paul White 2011 email: [email protected] twitter: @SQL_kiwi

    Read the article

  • Oracle Partner Store (OPS) New Enhancements

    - by Kristin Rose
    Effective June 29th, Oracle Partner Store (OPS) will release the enhancements listed below to improve your overall ordering experience. v Online Transactional Oracle Master Agreement (Online TOMA) The Online TOMA enables end users to execute a transactional end user license agreement with Oracle. The new Online TOMA in OPS will replace the need for you to obtain a signed hard copy of the TOMA from the end user. You will now initiate the Online TOMA via OPS. Navigation: OPS Home > Order Tools > Online TOMA Query > Request Online TOMA> End User Contact, click “Select for TOMA” > Select Language > Submit (an automated email is sent immediately to the requestor and the end user) Ø The Online TOMA can also be initiated from the ‘My OPS’ tab. Under the Online TOMA Query section partners can track Online TOMA request details submitted to end users. The status of the Online TOMA request and the OMA Key generated (once Ts&Cs of the Online TOMA are accepted by an end user) are also displayed in this table. There is also the ability to resend pending Online TOMA requests by clicking ‘Resend’. Navigation: OPS Home > Order Tools > Online TOMA Query For more details on the Transactional OMA, please click here. v Convert Deals to Carts The partner deal registration system within OPS will now allow you to convert approved deals into carts with a simple click of a button. VADs can use Deal to Cart on all of their partners' registrations, regardless of whether they submitted on their partner's behalf, or the partner submitted themselves. Navigation: Login > Deal Registrations > Deal Registration List > Open the approved deal > Click Deal Reg ID number link to open > Click on 'Create Cart' link You can locate your newly created cart in the Saved Carts section of OPS. Links are also available from within an open deal or from the Deal Registration List. Click on the cart number to proceed. v Partner Opportunity Management: Deal Registration on OPS now allows you to see updated information on your opportunities from Oracle’s Fusion CRM opportunity management system.  Key fields such as close date, sales stage, products and status can be viewed by clicking the opportunity ID associated with the deal registration.  This new feature allows you to see regular updates to your opportunities after registrations are approved.  Through ongoing communication with Oracle Channel Managers and Sales Reps, you can ensure that Oracle has the latest information on your active registered deals. v Product Recommendations: When adding products to the Deal Registrations tab, OPS will now show additional products that you can try to include to maximize your sale and rebate. v Advanced Customer Support(ACS) Services Note: This will be available from July 9th. Initiate the purchase of the complete stack (HW/SW/Services) online with one single OPS order. More ACS services now supported online with exception of Start-Up Pack: · New SW installation services for Standard Configurations & stand alone System Software. · New Pre-production & Go-live services for Standard & Engineered Systems · New SW configuration & Platinum Pre-Production & Go-Live services for Engineered Systems · New Travel & Expenses Estimate included · New Partner & VAD volume discount supported v Software as a Service (SaaS) for Independent Software Vendors (ISVs): Oracle SaaS ISVs can now use OPS to submit their monthly usage reports to Oracle within 20 days after the end of every month. Navigation: OPS Home > Cart > Transaction Type: Partner SaaS for ISV’s > Add Eligible Products > Check out v Existing Approvals: In an effort to reduce the processing time of discount approvals, we have added a new section in the Request Approval page for you to communicate pre-existing approvals without having to attach the DAT. Just enter the Approval ID and submit your request. In case of existing software approvals, you will be required to submit the DAT with the Contact Information section filled out. v Additional data for Shipping Box Labels and Packing Slips OPS now has additional fields in the Shipping Notes section for you to add PO details. This will help you easily identify shipments as they arrive. Partners will have an End User PO field, whereas VADs will have VAR and End User PO fields. v Shipping Notes on OPS Hardware delivery Shipping Notes will now have multiple options to better suit your requirements. v Reminders for Royalty Reporting Partners: If you have not submitted your royalty report online, OPS will now send an automated alert to remind you. v Order Tracker Changes: · Order Tracker will now have a deal reg flag (Yes/No). You can now clearly distinguish between orders that have registered opportunities. · All lines of the order will be visible in the order details list. v Changes in Terminology · You will notice textual changes on some of our labels and messages relating to approval requests. “Discount Requests” has been replaced with “Approval Requests” to cater to some of our other offerings. · First Line Support (FLS) transaction type has been renamed to Support Provider Partner (SPP). OPS Support For more details on these enhancements, please request a training here. For assistance on the Oracle Partner Store, please contact the OPS support team in your region. NAMER: [email protected] LAD: [email protected] EMEA : [email protected] APAC: [email protected] Japan: [email protected] You can even call us on our Hotline! Find your local number here.     Thank you, Oracle Partner Store Support Team      

    Read the article

  • PASS: The Budget Process

    - by Bill Graziano
    Every fiscal year PASS creates a detailed budget.  This helps us set priorities and communicate to our members what we’re going to do in the upcoming year.  You can review the current budget on the PASS Governance page.  That page currently requires you to login but I’m talking with HQ to see if there are any legal issues with opening that up. The Accounting Team The PASS accounting team is two people.  The Executive Vice-President of Finance (“EVP”) and the PASS Accounting Manager.  Sandy Cherry is the accounting manager and works at PASS HQ.  Sandy has been with PASS since we switched management companies in 2007.  Throughout this document when I talk about any actual work related to the budget that’s all Sandy :)  She’s the glue that gets us through this process.  Last year we went through 32 iterations of the budget before the Board approved so it’s a pretty busy time for her us – well, mostly her. Fiscal Year The PASS fiscal year runs from July 1st through June 30th the following year.  Right now we’re in fiscal year 2011.  Our 2010 Summit actually occurred in FY2011.  We switched to this schedule from a calendar year in 2006.  Our goal was to have the Summit occur early in our fiscal year.  That gives us the rest of the year to handle any significant financial impact from the Summit.  If registrations are down we can reduce spending.  If registrations are up we can decide how much to increase our reserves and how much to spend.  Keep in mind that the Summit is budgeted to generate 82% of our revenue this year.  How it performs has a significant impact on our financials.  The other benefit of this fiscal year is that it matches the Microsoft fiscal year.  We sign an annual sponsorship agreement with Microsoft and it’s very helpful that our fiscal years match. This year our budget process will probably start in earnest in March or April.  I’d like to be done in early June so we can publish before July 1st.  I was late publishing it this year and I’m trying not to repeat that. Our Budget Our actual budget is an Excel spreadsheet with 36 sheets.  We remove some of those when we publish it since they include salary information.  The budget is broken up into various portfolios or departments.  We have 20 portfolios.  They include chapters, marketing, virtual chapters, marketing, etc.  Ideally each portfolio is assigned to a Board member.  Each portfolio also typically has a staff person assigned to it.  Portfolios that aren’t assigned to a Board member are monitored by HQ and the ExecVP-Finance (me).  These are typically smaller portfolios such as deferred membership or Summit futures.  (More on those in a later post.)  All portfolios are reviewed by all Board members during the budget approval process, when interim financials are released internally and at year-end. The Process Our first step is to budget revenues.  The Board determines a target attendee number.  We have formulas based on historical performance that convert that to an overall attendee revenue number.  Other revenue projections (such as vendor sponsorships) come from different parts of the organization.  I hope to have another post with more details on how we project revenues. The next step is to budget expenses.  Board members fill out a sample spreadsheet with their budget for the year.  They can add line items and notes describing what the amounts are for.  Each Board portfolio typically has from 10 to 30 line items.  Any new initiatives they want to pursue needs to be budgeted.  The Summit operations budget is managed by HQ.  It includes the cost for food, electrical, internet, etc.  Most of these come from our estimate of attendees and our contract with the convention center.  During this process the Board can ask for more or less to be spent on various line items.  For example, if we weren’t happy with the Internet at the last Summit we can ask them to look into different options and/or increasing the budget.  HQ will also make adjustments to these numbers based on what they see at the events and the feedback we receive on the surveys. After we have all the initial estimates we start reviewing the entire budget.  It is sent out to the Board and we can see what each portfolio requested and what the overall profit and loss number is.  We usually start with too much in expenses and need to cut.  In years past the Board started haggling over these numbers as a group.  This past year they decided I should take a first cut and present them with a reasonable budget and a list of what I changed.  That worked well and I think we’ll continue to do that in the future. We go through a number of iterations on the budget.  If I remember correctly, we went through 32 iterations before we passed the budget.  At each iteration various revenue and expense numbers can change.  Keep in mind that the PASS budget has 200+ line items spread over 20 portfolios.  Many of these depend on other numbers.  For example, if we decide increase the projected attendees that cascades through our budget.  At each iteration we list what changed and the impact.  Ideally these discussions will take place at a face-to-face Board meeting.  Many of them also take place over the phone.  Board members explain any increase they are asking for while performing due diligence on other budget requests.  Eventually a budget emerges and is passed. Publishing After the budget is passed we create a version without the formulas and salaries for posting on the web site.  Sandy also creates some charts to help our members understand the budget.  The EVP writes a nice little letter describing some of the changes from last year’s budget.  You can see my letter and our budget on the PASS Governance page. And then, eight months later, we start all over again.

    Read the article

  • Webcast Q&A: Qualcomm Provides a Seamless Experience for Customers with Oracle WebCenter

    - by kellsey.ruppel
    Last Thursday we had the second webcast in our WebCenter in Action webcast series, "Qualcomm Provides a Seamless Experience for Customers with Oracle WebCenter, where customer Michael Chander from Qualcomm and Vince Casarez & Gourav Goyal from Oracle Partner Keste shared how Oracle WebCenter is powering Qualcomm’s externally facing website and providing a seamless experience for their customers. In case you missed it, here's a recap of the Q&A.   Mike Chandler, Qualcomm Q: Did you run into any issues when integrating all of the different applications together?A: Definitely, our main challenges were in the area of user provisioning and security propagation, all the standard stuff you might expect when hooking up SSO for authentication and authorization. In addition, we spent several iterations getting the UI’s in sync. While everyone was given the same digital material to build too, each team interpreted and implemented it their own way. Initially as a user navigated, if you were looking for it, you could slight variations in color or font or width , stuff like that. So we had to pull all the developers responsible for the UI together and get pixel level agreement on a lot of things so we could ensure seamless transitions across applications. Q: What has been the biggest benefit your end users have seen?A: Wow, there have been several. An SSO enabled environment was huge a win for our users. The portal application that this replaced had not really been invested in by the business. With this project, we had full business participation and backing, and it really showed in some key areas like the shopping experience. For example, while ordering in the previous site, the items did not have any pictures or really usable descriptions. A tremendous amount of work was done to try and make the site more intuitive and user friendly. Site performance has also drastically improved thanks to new hardware, improved database design, and of course the fact that ADF has made great strides in runtime performance. Q: Was there any resistance internally when implementing the solution? If so, how did you overcome that?A: Within a large company, I’m sure there is always going to be competition for large projects, as there was here. Once we got through the technical analysis and settled on the technology choices, it was actually no resistance to implementing the solution. This project was fully driven by the business with the aim of long term growth. I can confidently say that the fact that this project was given the utmost importance by both the business and IT really help put down any resistance that you would typically see while implementing a new solution. Q: Given the performance, what do you estimate to be the top end capacity of the system? A:I think our top end capacity is really only limited by our hardware. I’m comfortable saying we could grow 10x on our current hardware, both in terms of transactions and users. We can easily spin up new JVM instances if needed. We already use less JVM’s than we had planned. In addition, ADF is doing a very good job with his connection pooling and application module pooling, so we see a very good ratio of users connected to the systems vs db connections, without impacting performace. Q: What's the overview or summary of feedback from the users interacting with the site?A: Feedback has been overwhelmingly positive from both the business and our customers. They’re very happy with the new SSO environment , the new LAF, and the performance of the site. Of course, it’s not all roses. No matter what, there are always going to be people that don’t like the layout or the color scheme, etc. By and large though, customers are happy and the business is happy. Q: Can you describe the impressions about the site before and after the project within Qualcomm?A: Before the project, the site worked and people were using it, but most people were not happy with it. It was slow and tended to be a bit tempermental, for example a user would perform a transaction and the system would throw and unexpected error. The user could back up and retry the steps and things would work fine, so why didn’t work the first time?. From a UI perspective, we’d hear comments like it looked like it was built by a high school student.  Vince Casarez & Gourav Goyal, Keste Q: Did you run into any obstacles when implementing the solution?A: It's interesting some people call them "obstacles" on this project we just called them "dependencies".  There were both technical and business related dependencies that we had to work out. Mike points out the SSO dependencies and the coordination and synchronization between the teams to have a seamless login experience and a seamless end user experience.  There was also a set of dependencies on the User Acceptance testing to make sure that everyone understood the use cases for how the system would be used.  With a branching into a new market and trying to match a simple user experience as many consumer sites have today, there was always a tendency for the team members to provide their suggestions on how things could be simpler.  But with all the work up front on the user design and getting the business driving this set of experiences, this minimized the downstream suggestions that tend to distract a team.  In this case, all the work up front allowed us to enumerate the "dependencies" and keep the distractions to a minimum. Q: Was there a lot of custom work that needed to be done for this particular solution?A: The focus for this particular solution was really on the custom processes. The interesting thing is that with the data flows and the integration with applications, there are some pre-built integrations, but realistically for the process flow, we had to build those. The framework and tooling we used made things easier so we didn’t have to implement core functionality, like transitioning from screen to screen or from flow to flow. The design feature of Task Flows really helped speed the development and keep the component infrastructure in line with the dynamic processes.  Task flows and other elements like Skins are core to the infrastructure or technology stack of Oracle. This then allowed the team to center the project focus around the business flows and use cases to meet the core requirements and keep the project on time. Q: What do you think were the keys to success for rolling out WebCenter?A:  The 5 main keys to success were: 1) Sponsorship from the whole organization around this project from senior executive agreement, business owners driving functionality, and IT development alignment; 2) Upfront design planning and use case definition to clearly define the project scope and requirements; 3) Focussed development and project management aligned with the top level goals and drivers; 4) User acceptance and usability testing along the way to identify potential issues and direct resolution of the issues;  and 5) Constant prioritization of the issues for development to fix by the business.  It also helps to have great team chemistry and really smart people working on the project. If you missed the webcast, be sure to catch the replay to see a live demonstration of WebCenter in action!  Qualcomm Provides a Seamless Experience for Customers with Oracle WebCenter from Oracle WebCenter

    Read the article

< Previous Page | 14 15 16 17 18 19 20  | Next Page >