code analysis - Page 497

Synchronizing Asynchronous request handlers in Silverlight environment

- by Eric Lifka

For our senior design project my group is making a Silverlight application that utilizes graph theory concepts and stores the data in a database on the back end. We have a situation where we add a link between two nodes in the graph and upon doing so we run analysis to re-categorize our clusters of nodes. The problem is that this re-categorization is quite complex and involves multiple queries and updates to the database so if multiple instances of it run at once it quickly garbles data and breaks (by trying to re-insert already used primary keys). Essentially it's not thread safe, and we're trying to make it safe, and that's where we're failing and need help :). The create link function looks like this: private Semaphore dblock = new Semaphore(1, 1); // This function is on our service reference and gets called // by the client code. public int addNeed(int nodeOne, int nodeTwo) { dblock.WaitOne(); submitNewNeed(createNewNeed(nodeOne, nodeTwo)); verifyClusters(nodeOne, nodeTwo); dblock.Release(); return 0; } private void verifyClusters(int nodeOne, int nodeTwo) { // Run analysis of nodeOne and nodeTwo in graph } All copies of addNeed should wait for the first one that comes in to finish before another can execute. But instead they all seem to be running and conflicting with each other in the verifyClusters method. One solution would be to force our front end calls to be made synchronously. And in fact, when we do that everything works fine, so the code logic isn't broken. But when it's launched our application will be deployed within a business setting and used by internal IT staff (or at least that's the plan) so we'll have the same problem. We can't force all clients to submit data at different times, so we really need to get it synchronized on the back end. Thanks for any help you can give, I'd be glad to supply any additional information that you could need!

Read the article

OCR: How to improve accuracy - existing libraries for removing non-text 'furniture', shapes, etc to

- by Rob

I want to remove rectangles etc that enclose text in a screenshot image, so that I can perform optical character recognition to get accurate text from the screenshot. Background: I doing this to extract data from a legacy application for use with other applications. This is the only way to get at this data as associated files are in a closed, proprietary, binary format. I will be using AutoItScript to drive the application to show data in its UI, then I will screenshot this and feed this to tesseract. I've already had some success in automating the UI, and have been able to use tesseract to get plain ascii text out of the bitmap. There are several AutoItScripr forum articles discussing its use with tesseract/OCR but not specifically for my question. http://www.autoitscript.com/forum/index.php?s=6c32c3ece12756e635a619cdf175eff9&showforum=2 What I need to do There are thin, 1-pixel wide rectangles that closely enclose some text, when fed to tesseract, it sees them as I for example for a verticle line of the rectangle. Any thoughts on how to remove the rectangles, or best practices? I'm asking if there is a generic command line based toolset to overwrite rectangles, for example, in .png files. I could then pass the .png through this, then pass it to tesseract. Details on the tesseract release/setup I've used are as follows: Go here: http://code.google.com/p/tesseract-ocr/downloads/list - For the basic english generic character set to get Tesseract up and running and recognising your bitmapped text into ascii text, use tesseract-2.00.eng.tar.gz (current version at time of writing is: "English language data for Tesseract (2.00 and up) Jul 2007 989 KB 84845") Related questions I have already looked at on Stack Overflow http://stackoverflow.com/questions/1335581/how-to-give-best-chance-of-success-to-an-ocr-software http://stackoverflow.com/questions/2296568/analysis-and-transformation-of-the-image-on-the-basis-of-this-analysis-for-better http://stackoverflow.com/questions/2268028/reading-characters-off-of-the-screen In these, my question is not completely answered or a commercial solution is being sold. I do not want to consider a commercial solution at this stage.

Read the article

Text mining on large database (data mining)

- by yox

Hello, I have a large database of resumes (CV), and a certain table skills grouping all users skills. inside that table there's a field skill_text that describes the skill in full text. I'm looking for an algorithm/software/method to extract significant terms/phrases from that table in order to build a new table with standarized skills.. Here are some examples skills extracted from the DB : Sectoral and competitive analysis Business Development (incl. in international settings) Specific structure and road design software - Microstation, Macao, AutoCAD (basic knowledge) Creative work (Photoshop, In-Design, Illustrator) checking and reporting back on campaign progress organising and attending events and exhibitions Development : Aptana Studio, PHP, HTML, CSS, JavaScript, SQL, AJAX Discipline: One to one marketing, E-marketing (SEO & SEA, display, emailing, affiliate program) Mix marketing, Viral Marketing, Social network marketing. The output shoud be something like : Sectoral and competitive analysis Business Development Specific structure and road design software - Macao AutoCAD Photoshop In-Design Illustrator organising events Development Aptana Studio PHP HTML CSS JavaScript SQL AJAX Mix marketing Viral Marketing Social network marketing emailing SEO One to one marketing As you see only skills remains no other representation text. I know this is possible using text mining technics but how to do it ? the database is realy large.. it's a good thing because we can calculate text frequency and decide if it's a real skill or just meaningless text... The big problem is .. how to determin that "blablabla" is a skill ? thanks

Read the article

Agile and Scrum burning me down please help me figuring out the truth

- by jadook

hi all, in the last while I installed MS-TFS 2008 then started to get myself prepared to use Agile Process Guidance template shipped with the TFS. with little googling I passed through Mike Cohn materials: I watched his conference in youtube "sponsored by google: http://www.youtube.com/watch?v=fb9Rzyi8b90 http://www.youtube.com/watch?v=jeT0pOVg0EI Read his book "Agile Estimating and Planning" Watching the video series in his website: http://www.mountaingoatsoftware.com/presentations-tag/video-recorded I was very happy while absorbing and eating the techniques he is using with the teams and how agile and scrum is such a great software process/methodology until I saw Mike answering a question regarding an architect role and talking about the requirements document... at that point everything start falling apart due to the following: Last year I had been assigned to make full analysis "including requirements gathering" for big project "very high priority project". within 2 months of hardwork, dedication and commitment I delivered the whole analysis with full satisfaction of the customer and my BOSS and ZERO amendments. Later on, the project entered the architecting, development ... phases. due to the fact that the system included many competitive and exciting features I requested patenting it and its going in the process... so imagine you are the kind of person who used to love facing all kind of challenges and returning with excellent experience and results for the stakeholders and yourself, How fairly agile and scrum processes will credit and admit your talent and passion while the scrum master/coach treat the team as one unit that accomplish user stories and converge through trial and error approach??!!!! with that dark thoughts about agile and scrum I found many people "anti agile" and on top of them is "Crispin Rogers Johnson": http://agile-crispin.blogspot.com/ that guy made anti statement for everything Mike Cohn used to talk about. I really don't know what to do next! so any guidance will be appreciated. Thanks,

Read the article

rails + compass: advantages vs using haml + blueprint directly

- by egarcia

I've got some experience using haml (+sass) on rails projects. I recently started using them with blueprintcss - the only thing I did was transform blueprint.css into a sass file, and started coding from there. I even have a rails generator that includes all this by default. It seems that Compass does what I do, and other things. I'm trying to understand what those other things are - but the documentation/tutorials weren't very clear. These are my conclusions: Compass comes with built-in sass mixins that implement common CSS idioms, such as links with icons or horizontal lists. My solution doesn't provide anything like that. (1 point for Compass). Compass has several command-line options: you can create a rails project, but you can also "install" it on an existing rails project. A rails generator could be personalized to do the same thing, I guess. (Tie). Compass has two modes of working with blueprint: "basic" and "semantic" usage. I'm not clear about the differences between those. With my rails generator I only have one mode, but it seems enough. (Tie) Apparently, Compass is prepared to use other frameworks, besides blueprint (e.g. YUI). I could not find much documentation about this, and I'm not interested on it anyway - blueprint is ok for me (Tie). Compass' learning curve seems a bit stiff and the documentation seems sparse. Learning could be a bit difficult. On the other hand, I know the ins and outs of my own system and can use it right away. (1 point for my system). With this analysis, I'm hesitant to give Compass a try. Is my analysis correct? Are Am I missing any key points, or have I evaluated any of these points wrongly?

Read the article

structured vs. unstructured data in db

- by Igor

the question is one of design. i'm gathering a big chunk of performance data with lots of key-value pairs. pretty much everything in /proc/cpuinfo, /proc/meminfo/, /proc/loadavg, plus a bunch of other stuff, from several hundred hosts. right now, i just need to display the latest chunk of data in my UI. i will probably end up doing some analysis of the data gathered to figure out performance problems down the road, but this is a new application so i'm not sure what exactly i'm looking for performance-wise just yet. i could structure the data in the db -- have a column for each key i'm gathering. the table would end up being O(100) columns wide, it would be a pain to put into the db, i would have to add new columns if i start gathering a new stat. but it would be easy to sort/analyze the data just using SQL. or i could just dump my unstructured data blob into the table. maybe three columns -- host id, timestamp, and a serialized version of my array, probably using JSON in a TEXT field. which should I do? am i going to be sorry if i go with the unstructured approach? when doing analysis, should i just convert the fields i'm interested in and create a new, more structured table? what are the trade-offs i'm missing here?

Read the article

Exponential regression : p-value and F significance

- by Saravanan K

I am new to statistics. I have a set of independent data and dependent data (X,Y), where I would like to do an exponential regression to obtain its p-value and significant F (already obtained R2 and also the coefficients through mathematical calculation). What is the natural evolution from the (X,Y) data to mathematically calculate those variables. Spent a week on the internet to study this but unable to find the right answer. Often an exponential data, y=be^(mx) will be converted first to a linear data, ln y = mx + ln b . Then a linear regression will done on the converted data, obtaining its p-value etc. Assume we use a statistical tool such as Excel's Analysis ToolPak: Data Analysis : Regression, it will produce a result such as below, I believe the p-value and Significant F value is representing the converted linear data and not the original exponential data. Questions: What is the approach/steps used by Excel to get the p-value and Significant F value for the converted linear data as shown in the statistic output in the image above? It is not clear in their help page or website. Can the p-value and Significant F could be mathematically calculated for exponential regression without using a statistical tool? Can you assist to point me to the right link if this has been answered before.

Read the article

What is the difference between cubes and the Unified Dimensional Model (if any)?

- by ngm

I'm currently researching SQL Server 2008 as a business intelligence solution, and currently looking at Analysis Services (and I'm pretty new to business intelligence as a whole...) I'm a bit confused by some of the terms in SSAS, particularly the conceptual differences between cubes and MS's Unified Dimensional Model. I believe that a cube in SSAS is basically an OLAP cube -- dimensions, measures, something that sits between the underlying data source and a business user. But then that's kind of what I understand UDM to be as well. The docs for SQL Server 2005 seem to suggest as much: "A cube is essentially synonymous with a Unified Dimensional Model (UDM)". But then the SQL Server 2008 pages sort of suggest that UDM is a wrapper for both multidimensional data (cubes) and relational data: "Use the Unified Dimensional Model to provide one consolidated business view for relational and multidimensional data that includes business entities, business logic, calculations, and metrics." This blog post suggests similarly: "UDM provides a single dimensional model for all OLAP analysis and relational reporting needs. So you can use either MDX or SQL" Is UDM something that sits above cubes? Or are they the same thing? I presume I would develop cubes with the Cube Designer application; what would I develop a UDM with?

Read the article

Scripts to parse and download iTunes Connect and AppStore data

- by bradhouse

I'm looking for recommendations of a script or series of scripts that download and parse iTunes Connect sales data and AppStore comments, ratings and rankings data for a defined app. I'm also aware of solutions like: AppViz appsales-mobile iphone-stats Heartbeat.app I'm sure I'll find a few more with more searching. I can't help but feel there must be a really decent set of open source scripts out there to do this, given how many developers are now writing apps for the AppStore. Would be interested to hear any commercial offerings as well (although my personal preference is for open source, so I can at least see what it is doing with my iTunes Connect login credentials). To be clear, I'm really looking for something that hits all of the areas mentioned: App Store (per store) Comments Ratings Category/store rankings iTunes Connect The contents of the sales reports Analysis/graphs of the data is not necessary (but would be a nice to have I guess). I'm not really looking for something like AppSales Mobile above, I would like the raw data so I can do my own analysis and formatting. So far it looks like AppViz (listed above) is the best out there. Any suggestions on what is good/available or should I just go roll my own?

Read the article

deadlock because of foreign key?

- by George2

Hello everyone, I am using SQL Server 2008 Enterprise. I met with deadlock in the following store procedure, but because of my fault, I did not record the deadlock graph. But now I can not reproduce deadlock issue. I want to have a postmortem to find the root cause of deadlock to avoid deadlock in the future. The deadlock happens on delete statement. For the delete statement, Param1 is a column of table FooTable, Param1 is a foreign key of another table (refers to another primary key clustered index column of the other table). There is no index on Param1 itself for table FooTable. FooTable has another column which is used as clustered primary key, but not Param1 column. Here is my guess why there is deadlock, and I want to let people review whether my analysis is correct? Since Param1 column has no index, there will be a table scan, and will acquire table level lock, because of foreign key, the delete operation will also need to check master table (e.g. to acquire lock on master table); Some operation on master table acquires master table lock, but want to acquire lock on FooTable; (1) and (2) cause cycle lock which makes deadlock happen. My analysis correct? Any reproduce scenario? create PROCEDURE [dbo].[FooProc] ( @Param1 int ,@Param2 int ,@Param3 int ) AS DELETE FooTable WHERE Param1 = @Param1 INSERT INTO FooTable ( Param1 ,Param2 ,Param3 ) VALUES ( @Param1 ,@Param2 ,@Param3 ) DECLARE @ID bigint SET @ID = ISNULL(@@Identity,-1) IF @ID > 0 BEGIN SELECT IdentityStr FROM FooTable WHERE ID = @ID END thanks in advance, George

Read the article

Getting started with massive data

- by Max

I'm a math guy and occasionally do some statistics/machine learning analysis consulting projects on the side. The data I have access to are usually on the smaller side, at most a couple hundred of megabytes (and almost always far less), but I want to learn more about handling and analyzing data on the gigabyte/terabyte scale. What do I need to know and what are some good resources to learn from? Hadoop/MapReduce is one obvious start. Is there a particular programming language I should pick up? (I primarily work now in Python, Ruby, R, and occasionally Java, but it seems like C and Clojure are often used for large-scale data analysis?) I'm not really familiar with the whole NoSQL movement, except that it's associated with big data. What's a good place to learn about it, and is there a particular implementation (Cassandra, CouchDB, etc.) I should get familiar with? Where can I learn about applying machine learning algorithms to huge amounts of data? My math background is mostly on the theory side, definitely not on the numerical or approximation side, and I'm guessing most of the standard ML algorithms don't really scale. Any other suggestions on things to learn would be great!

Read the article

Is this a good job description? What title would you give this position?

- by Zack Peterson

Department: Information Technology Reports To: Chief Information Officer Purpose: Company's ________________ is specifically engaged in the development of World Wide Web applications and distributed network applications. This person is concerned with all facets of the software development process and specializes in software product management. He or she contributes to projects in an application architect role and also performs individual programming tasks. Essential Duties & Responsibilities: This person is involved in all aspects of the software development process such as: Participation in software product definitions, including requirements analysis and specification Development and refinement of simulations or prototypes to confirm requirements Feasibility and cost-benefit analysis, including the choice of architecture and framework Application and database design Implementation (e.g. installation, configuration, customization, integration, data migration) Authoring of documentation needed by users and partners Testing, including defining/supporting acceptance testing and gathering feedback from pre-release testers Participation in software release and post-release activities, including support for product launch evangelism (e.g. developing demonstrations and/or samples) and subsequent product build/release cycles Maintenance Qualifications: Bachelor's degree in computer science or software engineering Several years of professional programming experience Proficiency in the general technology of the World Wide Web: Hypertext Transfer Protocol (HTTP) Hypertext Markup Language (HTML) JavaScript Cascading Style Sheets (CSS) Proficiency in the following principles, practices, and techniques: Accessibility Interoperability Usability Security (especially prevention of SQL injection and cross-site scripting (XSS) attacks) Object-oriented programming (e.g. encapsulation, inheritance, modularity, polymorphism, etc.) Relational database design (e.g. normalization, orthogonality) Search engine optimization (SEO) Asynchronous JavaScript and XML (AJAX) Proficiency in the following specific technologies utilized by Company: C# or Visual Basic .NET ADO.NET (including ADO.NET Entity Framework) ASP.NET (including ASP.NET MVC Framework) Windows Presentation Foundation (WPF) Language Integrated Query (LINQ) Extensible Application Markup Language (XAML) jQuery Transact-SQL (T-SQL) Microsoft Visual Studio Microsoft Internet Information Services (IIS) Microsoft SQL Server Adobe Photoshop

Read the article

Simplest way to automatically alter "const" value in Java during compile time

- by Michael Mao

Hi all: This is a question corresponds to my uni assignment so I am very sorry to say I cannot adopt any one of the following best practices in a short time -- you know -- assignment is due tomorrow :( link to Best way to alter const in Java on StackOverflow Basically the only task (I hope so) left for me is the performance tuning. I've got a bunch of predefined "const" values in my single-class agent source code like this: //static final values private static final long FT_THRESHOLD = 400; private static final long FT_THRESHOLD_MARGIN = 50; private static final long FT_SMOOTH_PRICE_IDICATOR = 20; private static final long FT_BUY_PRICE_MODIFIER = 0; private static final long FT_LAST_ROUNDS_STARTTIME = 90; private static final long FT_AMOUNT_ADJUST_MODIFIER = 5; private static final long FT_HISTORY_PIRCES_LENGTH = 10; private static final long FT_TRACK_DURATION = 5; private static final int MAX_BED_BID_NUM_PER_AUC = 12; I can definitely alter the values manually and then compile the code to give it another go around. But the execution time for a thorough "statistic analysis" usually requires over 2000 times of execution, which will typically lasts more than half an hour on my own laptop... So I hope there is a way to alter values using other ways than dig into the source code to change the "const" values there, so I can automatically distributed compiled code to other people's PC and let them run the statistic analysis instead. One other reason for a automatically value adjustment is that I can try using my own agent to defeat itself by choosing different "const" values. Although my values are derived from previous history and statistical results, they are far from the most optimized values. I hope there is a easy way so I can quickly adopt that so to have a good sleep tonight while the computer does everything for me... :) Any hints on this sort of stuff? Any suggestion is welcomed and much appreciated.

Read the article

Can ScalaCheck/Specs warnings safely be ignored when using SBT with ScalaTest?

- by pdbartlett

I have a simple FunSuite-based ScalaTest: package pdbartlett.hello_sbt import org.scalatest.FunSuite class SanityTest extends FunSuite { test("a simple test") { assert(true) } test("a very slightly more complicated test - purposely fails") { assert(42 === (6 * 9)) } } Which I'm running with the following SBT project config: import sbt._ class HelloSbtProject(info: ProjectInfo) extends DefaultProject(info) { // Dummy action, just to show config working OK. lazy val solveQ = task { println("42"); None } // Managed dependencies val scalatest = "org.scalatest" % "scalatest" % "1.0" % "test" } However, when I runsbt test I get the following warnings: ... [info] == test-compile == [info] Source analysis: 0 new/modified, 0 indirectly invalidated, 0 removed. [info] Compiling test sources... [info] Nothing to compile. [warn] Could not load superclass 'org.scalacheck.Properties' : java.lang.ClassNotFoundException: org.scalacheck.Properties [warn] Could not load superclass 'org.specs.Specification' : java.lang.ClassNotFoundException: org.specs.Specification [warn] Could not load superclass 'org.specs.Specification' : java.lang.ClassNotFoundException: org.specs.Specification [info] Post-analysis: 3 classes. [info] == test-compile == For the moment I'm assuming these are just "noise" (caused by the unified test interface?) and that I can safely ignore them. But it is slightly annoying to some inner OCD part of me (though not so annoying that I'm prepared to add dependencies for the other frameworks). Is this a correct assumption, or are there subtle errors in my test/config code? If it is safe to ignore, is there any other way to suppress these errors, or do people routinely include all three frameworks so they can pick and choose the best approach for different tests? TIA, Paul. (ADDED: scala v2.7.7 and sbt v0.7.4)

Read the article

Using an element against an entire list in Haskell

- by Snick

I have an assignment and am currently caught in one section of what I'm trying to do. Without going in to specific detail here is the basic layout: I'm given a data element, f, that holds four different types inside (each with their own purpose): data F = F Float Int, Int a function: func :: F -> F-> Q Which takes two data elements and (by simple calculations) returns a type that is now an updated version of one of the types in the first f. I now have an entire list of these elements and need to run the given function using one data element and return the type's value (not the data element). My first analysis was to use a foldl function: myfunc :: F -> [F] -> Q myfunc y [] = func y y -- func deals with the same data element calls myfunc y (x:xs) = foldl func y (x:xs) however I keep getting the same error: "Couldn't match expected type 'F' against inferred type 'Q'. In the first argument of 'foldl', namely 'myfunc' In the expression: foldl func y (x:xs) I apologise for such an abstract analysis on my problem but could anyone give me an idea as to what I should do? Should I even use a fold function or is there recursion I'm not thinking about?

Read the article

R: Are there any alternatives to loops for subsetting from an optimization standpoint?

- by Adam

A recurring analysis paradigm I encounter in my research is the need to subset based on all different group id values, performing statistical analysis on each group in turn, and putting the results in an output matrix for further processing/summarizing. How I typically do this in R is something like the following: data.mat <- read.csv("...") groupids <- unique(data.mat$ID) #Assume there are then 100 unique groups results <- matrix(rep("NA",300),ncol=3,nrow=100) for(i in 1:100) { tempmat <- subset(data.mat,ID==groupids[i]) #Run various stats on tempmat (correlations, regressions, etc), checking to #make sure this specific group doesn't have NAs in the variables I'm using #and assign results to x, y, and z, for example. results[i,1] <- x results[i,2] <- y results[i,3] <- z } This ends up working for me, but depending on the size of the data and the number of groups I'm working with, this can take up to three days. Besides branching out into parallel processing, is there any "trick" for making something like this run faster? For instance, converting the loops into something else (something like an apply with a function containing the stats I want to run inside the loop), or eliminating the need to actually assign the subset of data to a variable?

Read the article

SPSS - sum of squares change radically with slight model changes in ANOVA??

- by Pat

I have noticed that the sum of squares in my models can change fairly radically with even the slightest adjustment to my models???? Is this normal???? I'm using SPSS 16, and both models presented below used the same data and variables with only one small change - categorizing one of the variables as either a 2 level or 3 level variable. Details - using a 2 x 2 x 6 mixed model ANOVA with the 6 being the repeated measure i get the following in the between group analysis ------------------------------------------------------------ Source | Type III SS | df | MS | F | Sig ------------------------------------------------------------ intercept | 4086.46 | 1 | 4086.46 | 104.93 | .000 X | 224.61 | 1 | 224.61 | 5.77 | .019 Y | 2.60 | 1 | 2.60 | .07 | .80 X by Y | 19.25 | 1 | 19.25 | .49 | .49 Error | 2570.40 | 66 | 38.95 | Then, when I use the exact same data but a slightly different model in which variable Y has 3 levels instead of 2 levels I get the following ------------------------------------------------------------ Source | Type III SS | df | MS | F | Sig ------------------------------------------------------------ intercept | 3603.88 | 1 | 3603.88 | 90.89 | .000 X | 171.89 | 1 | 171.89 | 4.34 | .041 Y | 19.23 | 2 | 9.62 | .24 | .79 X by Y | 17.90 | 2 | 17.90 | .80 | .80 Error | 2537.76 | 64 | 39.65 | I don't understand why variable X would have a different sum of squares simply because variable Y gets devided up into 3 levels instead of 2. This is also the case in the within groups analysis too. Please help me understand :D Thank you in advance Pat

Read the article

Git add not working with .png files?

- by D Lawson

I have a dirty working tree, dirty because I made changes to source files and touched up some images. I was trying to add just the images to the index, so I ran this command: git add *.png But, this doesn't add the files. There were a few new image files that were added, but none of the ones that were modified/pre-existing were added. What gives? Edit: Here is some relevant terminal output $ git status # On branch master # # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # # modified: src/main/java/net/plugins/analysis/FormMatcher.java # modified: src/main/resources/icons/doctor_edit_male.png # modified: src/main/resources/icons/doctor_female.png # # Untracked files: # (use "git add <file>..." to include in what will be committed) # # src/main/resources/icons/arrow_up.png # src/main/resources/icons/bullet_arrow_down.png # src/main/resources/icons/bullet_arrow_up.png no changes added to commit (use "git add" and/or "git commit -a") Then executed "git add *.png" (no output after command) Then: $ git status # On branch master # # Changes to be committed: # (use "git reset HEAD <file>..." to unstage) # # new file: src/main/resources/icons/arrow_up.png # new file: src/main/resources/icons/bullet_arrow_down.png # new file: src/main/resources/icons/bullet_arrow_up.png # # Changed but not updated: # (use "git add <file>..." to update what will be committed) # (use "git checkout -- <file>..." to discard changes in working directory) # # modified: src/main/java/net/plugins/analysis/FormMatcher.java # modified: src/main/resources/icons/doctor_edit_female.png # modified: src/main/resources/icons/doctor_edit_male.png

Read the article

Dimension Reduction in Categorical Data with missing values

- by user227290

I have a regression model in which the dependent variable is continuous but ninety percent of the independent variables are categorical(both ordered and unordered) and around thirty percent of the records have missing values(to make matters worse they are missing randomly without any pattern, that is, more that forty five percent of the data hava at least one missing value). There is no a priori theory to choose the specification of the model so one of the key tasks is dimension reduction before running the regression. While I am aware of several methods for dimension reduction for continuous variables I am not aware of a similar statical literature for categorical data (except, perhaps, as a part of correspondence analysis which is basically a variation of principal component analysis on frequency table). Let me also add that the dataset is of moderate size 500000 observations with 200 variables. I have two questions. Is there a good statistical reference out there for dimension reduction for categorical data along with robust imputation (I think the first issue is imputation and then dimension reduction)? This is linked to implementation of above problem. I have used R extensively earlier and tend to use transcan and impute function heavily for continuous variables and use a variation of tree method to impute categorical values. I have a working knowledge of Python so if something is nice out there for this purpose then I will use it. Any implementation pointers in python or R will be of great help. Thank you.

Read the article

Where should test classes be stored in the project?

- by limc

I build all my web projects at work using RAD/Eclipse, and I'm interested to know where do you guys normally store your test's *.class files. All my web projects have 2 source folders: "src" for source and "test" for testcases. The generated *.class files for both source folders are currently placed under WebContent/WEB-INF/classes folder. I want to separate the test *.class files from the src *.class files for 2 reasons:- There's no point to store them in WebContent/WEB-INF/classes and deploy them in production. Sonar and some other static code analysis tools don't produce an accurate static code analysis because it takes account of my crappy yet correct testcase code. So, right now, I have the following output folders:- "src" source folder compiles to WebContent/WEB-INF/classes folder. "test" source folder compiles to target/test-classes folder. Now, I'm getting this warning from RAD:- Broken single-root rule: A project may not contain more than one output folder. So, it seems like Eclipse-based IDEs prefer one project = one output folder, yet it provides an option for me to set up a custom output folder for my additional source folder from the "build path" dialog, and then it barks at me. I know I can just disable this warning myself, but I want to know how you guys handle this. Thanks.

Read the article

What type of data store should I use for my ios app?

- by mwiederrecht

I am pretty new to ios and using servers so forgive me. I am building an ios app for research. I need to monitor things that the user does and then push it up to a server for analysis (yes, with user and IRB permission). On the client's side I need to keep quite a bit of data that won't really change except in the case of pulling an updated version from the server, and then a minimal amount of user-specific data. Most of the data I will collect needs to be pushed to a server for analysis and then can be deleted from the client side. I am struggling to figure out what kind of data store I need to use, especially since I am not quite sure how the pushing and pulling from the server process works yet. Does it make sense to use Core Data? XML? SQLite? I like the Core Data idea, but I am not sure what kind of problems I will run into when I need to send large amounts of data to it and from it from the server. I imagine I might need to send data in a different form than it is probably stored in on either end - so what kind of overhead am I likely to run into in the process of converting that data? Is there a good format to save stuff in that would work well for me on both ends AND for sending the data? As you can probably tell, I could use some advice. Thanks!

Read the article

How can I call multiple nutrient information from ESHA Research API? (apid.esha.com)

- by user1833044

I want to call ESHA Research nutrient REST API. I cannot seem to figure out how to call multiple nutrients using ESHA REST API. So far I am calling the following and only able to retrieve the calories, or protein, or another type of nutrient information. So I was hoping someone had experience in retrieving all the nutrient information with one call. Is this possible? This is how I call to retrieve the TWIX nutrient http://api.esha.com/analysis?apikey=xxxx&fo=urn:uuid:81d268ac-f1dc-4991-98c1-1b4d3a5006da (returns calories, please note the api key is not xxxx but instead a key generated from Esha once you sign up as developer) The return is JSON format. If I want to call fat it would be the following http://api.esha.com/analysis?apikey=xxxx&fo=urn:uuid:81d268ac-f1dc-4991-98c1-1b4d3a5006da&n=urn:uuid:589294dc-3dcc-4b64-be06-c07e7f65c4bd How can I make a call once and get a return of all the nutrients (so Fat, Calories, Carbs, Vitamins, etc..) for a particular food ID? I have researched and looked at this for a while and cannot seem to find the answer. Thanks in advance for your help.

Read the article

.NET HTML Sanitation for rich HTML Input

- by Rick Strahl

Recently I was working on updating a legacy application to MVC 4 that included free form text input. When I set up the new site my initial approach was to not allow any rich HTML input, only simple text formatting that would respect a few simple HTML commands for bold, lists etc. and automatically handles line break processing for new lines and paragraphs. This is typical for what I do with most multi-line text input in my apps and it works very well with very little development effort involved. Then the client sprung another note: Oh by the way we have a bunch of customers (real estate agents) who need to post complete HTML documents. Oh uh! There goes the simple theory. After some discussion and pleading on my part (<snicker>) to try and avoid this type of raw HTML input because of potential XSS issues, the client decided to go ahead and allow raw HTML input anyway. There has been lots of discussions on this subject on StackOverFlow (and here and here) but to after reading through some of the solutions I didn't really find anything that would work even closely for what I needed. Specifically we need to be able to allow just about any HTML markup, with the exception of script code. Remote CSS and Images need to be loaded, links need to work and so. While the 'legit' HTML posted by these agents is basic in nature it does span most of the full gamut of HTML (4). Most of the solutions XSS prevention/sanitizer solutions I found were way to aggressive and rendered the posted output unusable mostly because they tend to strip any externally loaded content. In short I needed a custom solution. I thought the best solution to this would be to use an HTML parser - in this case the Html Agility Pack - and then to run through all the HTML markup provided and remove any of the blacklisted tags and a number of attributes that are prone to JavaScript injection. There's much discussion on whether to use blacklists vs. whitelists in the discussions mentioned above, but I found that whitelists can make sense in simple scenarios where you might allow manual HTML input, but when you need to allow a larger array of HTML functionality a blacklist is probably easier to manage as the vast majority of elements and attributes could be allowed. Also white listing gets a bit more complex with HTML5 and the new proliferation of new HTML tags and most new tags generally don't affect XSS issues directly. Pure whitelisting based on elements and attributes also doesn't capture many edge cases (see some of the XSS cheat sheets listed below) so even with a white list, custom logic is still required to handle many of those edge cases. The Microsoft Web Protection Library (AntiXSS) My first thought was to check out the Microsoft AntiXSS library. Microsoft has an HTML Encoding and Sanitation library in the Microsoft Web Protection Library (formerly AntiXSS Library) on CodePlex, which provides stricter functions for whitelist encoding and sanitation. Initially I thought the Sanitation class and its static members would do the trick for me,but I found that this library is way too restrictive for my needs. Specifically the Sanitation class strips out images and links which rendered the full HTML from our real estate clients completely useless. I didn't spend much time with it, but apparently I'm not alone if feeling this library is not really useful without some way to configure operation. To give you an example of what didn't work for me with the library here's a small and simple HTML fragment that includes script, img and anchor tags. I would expect the script to be stripped and everything else to be left intact. Here's the original HTML:var value = "<b>Here</b> <script>alert('hello')</script> we go. Visit the " + "<a href='http://west-wind.com'>West Wind</a> site. " + "<img src='http://west-wind.com/images/new.gif' /> " ; and the code to sanitize it with the AntiXSS Sanitize class:@Html.Raw(Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(value)) This produced a not so useful sanitized string: Here we go. Visit the <a>West Wind</a> site. While it removed the <script> tag (good) it also removed the href from the link and the image tag altogether (bad). In some situations this might be useful, but for most tasks I doubt this is the desired behavior. While links can contain javascript: references and images can 'broadcast' information to a server, without configuration to tell the library what to restrict this becomes useless to me. I couldn't find any way to customize the white list, nor is there code available in this 'open source' library on CodePlex. Using Html Agility Pack for HTML Parsing The WPL library wasn't going to cut it. After doing a bit of research I decided the best approach for a custom solution would be to use an HTML parser and inspect the HTML fragment/document I'm trying to import. I've used the HTML Agility Pack before for a number of apps where I needed an HTML parser without requiring an instance of a full browser like the Internet Explorer Application object which is inadequate in Web apps. In case you haven't checked out the Html Agility Pack before, it's a powerful HTML parser library that you can use from your .NET code. It provides a simple, parsable HTML DOM model to full HTML documents or HTML fragments that let you walk through each of the elements in your document. If you've used the HTML or XML DOM in a browser before you'll feel right at home with the Agility Pack. Blacklist based HTML Parsing to strip XSS Code For my purposes of HTML sanitation, the process involved is to walk the HTML document one element at a time and then check each element and attribute against a blacklist. There's quite a bit of argument of what's better: A whitelist of allowed items or a blacklist of denied items. While whitelists tend to be more secure, they also require a lot more configuration. In the case of HTML5 a whitelist could be very extensive. For what I need, I only want to ensure that no JavaScript is executed, so a blacklist includes the obvious <script> tag plus any tag that allows loading of external content including <iframe>, <object>, <embed> and <link> etc. <form> is also excluded to avoid posting content to a different location. I also disallow <head> and <meta> tags in particular for my case, since I'm only allowing posting of HTML fragments. There is also some internal logic to exclude some attributes or attributes that include references to JavaScript or CSS expressions. The default tag blacklist reflects my use case, but is customizable and can be added to. Here's my HtmlSanitizer implementation:using System.Collections.Generic; using System.IO; using System.Xml; using HtmlAgilityPack; namespace Westwind.Web.Utilities { public class HtmlSanitizer { public HashSet<string> BlackList = new HashSet<string>() { { "script" }, { "iframe" }, { "form" }, { "object" }, { "embed" }, { "link" }, { "head" }, { "meta" } }; /// <summary> /// Cleans up an HTML string and removes HTML tags in blacklist /// </summary> /// <param name="html"></param> /// <returns></returns> public static string SanitizeHtml(string html, params string[] blackList) { var sanitizer = new HtmlSanitizer(); if (blackList != null && blackList.Length > 0) { sanitizer.BlackList.Clear(); foreach (string item in blackList) sanitizer.BlackList.Add(item); } return sanitizer.Sanitize(html); } /// <summary> /// Cleans up an HTML string by removing elements /// on the blacklist and all elements that start /// with onXXX . /// </summary> /// <param name="html"></param> /// <returns></returns> public string Sanitize(string html) { var doc = new HtmlDocument(); doc.LoadHtml(html); SanitizeHtmlNode(doc.DocumentNode); //return doc.DocumentNode.WriteTo(); string output = null; // Use an XmlTextWriter to create self-closing tags using (StringWriter sw = new StringWriter()) { XmlWriter writer = new XmlTextWriter(sw); doc.DocumentNode.WriteTo(writer); output = sw.ToString(); // strip off XML doc header if (!string.IsNullOrEmpty(output)) { int at = output.IndexOf("?>"); output = output.Substring(at + 2); } writer.Close(); } doc = null; return output; } private void SanitizeHtmlNode(HtmlNode node) { if (node.NodeType == HtmlNodeType.Element) { // check for blacklist items and remove if (BlackList.Contains(node.Name)) { node.Remove(); return; } // remove CSS Expressions and embedded script links if (node.Name == "style") { if (string.IsNullOrEmpty(node.InnerText)) { if (node.InnerHtml.Contains("expression") || node.InnerHtml.Contains("javascript:")) node.ParentNode.RemoveChild(node); } } // remove script attributes if (node.HasAttributes) { for (int i = node.Attributes.Count - 1; i >= 0; i--) { HtmlAttribute currentAttribute = node.Attributes[i]; var attr = currentAttribute.Name.ToLower(); var val = currentAttribute.Value.ToLower(); span style="background: white; color: green">// remove event handlers if (attr.StartsWith("on")) node.Attributes.Remove(currentAttribute); // remove script links else if ( //(attr == "href" || attr== "src" || attr == "dynsrc" || attr == "lowsrc") && val != null && val.Contains("javascript:")) node.Attributes.Remove(currentAttribute); // Remove CSS Expressions else if (attr == "style" && val != null && val.Contains("expression") || val.Contains("javascript:") || val.Contains("vbscript:")) node.Attributes.Remove(currentAttribute); } } } // Look through child nodes recursively if (node.HasChildNodes) { for (int i = node.ChildNodes.Count - 1; i >= 0; i--) { SanitizeHtmlNode(node.ChildNodes[i]); } } } } } Please note: Use this as a starting point only for your own parsing and review the code for your specific use case! If your needs are less lenient than mine were you can you can make this much stricter by not allowing src and href attributes or CSS links if your HTML doesn't allow it. You can also check links for external URLs and disallow those - lots of options. The code is simple enough to make it easy to extend to fit your use cases more specifically. It's also quite easy to make this code work using a WhiteList approach if you want to go that route. The code above is semi-generic for allowing full featured HTML fragments that only disallow script related content. The Sanitize method walks through each node of the document and then recursively drills into all of its children until the entire document has been traversed. Note that the code here uses an XmlTextWriter to write output - this is done to preserve XHTML style self-closing tags which are otherwise left as non-self-closing tags. The sanitizer code scans for blacklist elements and removes those elements not allowed. Note that the blacklist is configurable either in the instance class as a property or in the static method via the string parameter list. Additionally the code goes through each element's attributes and looks for a host of rules gleaned from some of the XSS cheat sheets listed at the end of the post. Clearly there are a lot more XSS vulnerabilities, but a lot of them apply to ancient browsers (IE6 and versions of Netscape) - many of these glaring holes (like CSS expressions - WTF IE?) have been removed in modern browsers. What a Pain To be honest this is NOT a piece of code that I wanted to write. I think building anything related to XSS is better left to people who have far more knowledge of the topic than I do. Unfortunately, I was unable to find a tool that worked even closely for me, or even provided a working base. For the project I was working on I had no choice and I'm sharing the code here merely as a base line to start with and potentially expand on for specific needs. It's sad that Microsoft Web Protection Library is currently such a train wreck - this is really something that should come from Microsoft as the systems vendor or possibly a third party that provides security tools. Luckily for my application we are dealing with a authenticated and validated users so the user base is fairly well known, and relatively small - this is not a wide open Internet application that's directly public facing. As I mentioned earlier in the post, if I had my way I would simply not allow this type of raw HTML input in the first place, and instead rely on a more controlled HTML input mechanism like MarkDown or even a good HTML Edit control that can provide some limits on what types of input are allowed. Alas in this case I was overridden and we had to go forward and allow *any* raw HTML posted. Sometimes I really feel sad that it's come this far - how many good applications and tools have been thwarted by fear of XSS (or worse) attacks? So many things that could be done *if* we had a more secure browser experience and didn't have to deal with every little script twerp trying to hack into Web pages and obscure browser bugs. So much time wasted building secure apps, so much time wasted by others trying to hack apps… We're a funny species - no other species manages to waste as much time, effort and resources as we humans do :-) Resources Code on GitHub Html Agility Pack XSS Cheat Sheet XSS Prevention Cheat Sheet Microsoft Web Protection Library (AntiXss) StackOverflow Links: http://stackoverflow.com/questions/341872/html-sanitizer-for-net http://blog.stackoverflow.com/2008/06/safe-html-and-xss/ http://code.google.com/p/subsonicforums/source/browse/trunk/SubSonic.Forums.Data/HtmlScrubber.cs?r=61© Rick Strahl, West Wind Technologies, 2005-2012Posted in Security HTML ASP.NET JavaScript Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article

jQuery Globalization Plugin from Microsoft

- by ScottGu

Last month I blogged about how Microsoft is starting to make code contributions to jQuery, and about some of the first code contributions we were working on: jQuery Templates and Data Linking support. Today, we released a prototype of a new jQuery Globalization Plugin that enables you to add globalization support to your JavaScript applications. This plugin includes globalization information for over 350 cultures ranging from Scottish Gaelic, Frisian, Hungarian, Japanese, to Canadian English. We will be releasing this plugin to the community as open-source. You can download our prototype for the jQuery Globalization plugin from our Github repository: http://github.com/nje/jquery-glob You can also download a set of samples that demonstrate some simple use-cases with it here. Understanding Globalization The jQuery Globalization plugin enables you to easily parse and format numbers, currencies, and dates for different cultures in JavaScript. For example, you can use the Globalization plugin to display the proper currency symbol for a culture: You also can use the Globalization plugin to format dates so that the day and month appear in the right order and the day and month names are correctly translated: Notice above how the Arabic year is displayed as 1431. This is because the year has been converted to use the Arabic calendar. Some cultural differences, such as different currency or different month names, are obvious. Other cultural differences are surprising and subtle. For example, in some cultures, the grouping of numbers is done unevenly. In the "te-IN" culture (Telugu in India), groups have 3 digits and then 2 digits. The number 1000000 (one million) is written as "10,00,000". Some cultures do not group numbers at all. All of these subtle cultural differences are handled by the jQuery Globalization plugin automatically. Getting dates right can be especially tricky. Different cultures have different calendars such as the Gregorian and UmAlQura calendars. A single culture can even have multiple calendars. For example, the Japanese culture uses both the Gregorian calendar and a Japanese calendar that has eras named after Japanese emperors. The Globalization Plugin includes methods for converting dates between all of these different calendars. Using Language Tags The jQuery Globalization plugin uses the language tags defined in the RFC 4646 and RFC 5646 standards to identity cultures (see http://tools.ietf.org/html/rfc5646). A language tag is composed out of one or more subtags separated by hyphens. For example: Language Tag Language Name (in English) en-AU English (Australia) en-BZ English (Belize) en-CA English (Canada) Id Indonesian zh-CHS Chinese (Simplified) Legacy Zu isiZulu Notice that a single language, such as English, can have several language tags. Speakers of English in Canada format numbers, currencies, and dates using different conventions than speakers of English in Australia or the United States. You can find the language tag for a particular culture by using the Language Subtag Lookup tool located here: http://rishida.net/utils/subtags/ The jQuery Globalization plugin download includes a folder named globinfo that contains the information for each of the 350 cultures. Actually, this folder contains more than 700 files because the folder includes both minified and un-minified versions of each file. For example, the globinfo folder includes JavaScript files named jQuery.glob.en-AU.js for English Australia, jQuery.glob.id.js for Indonesia, and jQuery.glob.zh-CHS for Chinese (Simplified) Legacy. Example: Setting a Particular Culture Imagine that you have been asked to create a German website and want to format all of the dates, currencies, and numbers using German formatting conventions correctly in JavaScript on the client. The HTML for the page might look like this: Notice the span tags above. They mark the areas of the page that we want to format with the Globalization plugin. We want to format the product price, the date the product is available, and the units of the product in stock. To use the jQuery Globalization plugin, we’ll add three JavaScript files to the page: the jQuery library, the jQuery Globalization plugin, and the culture information for a particular language: In this case, I’ve statically added the jQuery.glob.de-DE.js JavaScript file that contains the culture information for German. The language tag “de-DE” is used for German as spoken in Germany. Now that I have all of the necessary scripts, I can use the Globalization plugin to format the product price, date available, and units in stock values using the following client-side JavaScript: The jQuery Globalization plugin extends the jQuery library with new methods - including new methods named preferCulture() and format(). The preferCulture() method enables you to set the default culture used by the jQuery Globalization plugin methods. Notice that the preferCulture() method accepts a language tag. The method will find the closest culture that matches the language tag. The $.format() method is used to actually format the currencies, dates, and numbers. The second parameter passed to the $.format() method is a format specifier. For example, passing “c” causes the value to be formatted as a currency. The ReadMe file at github details the meaning of all of the various format specifiers: http://github.com/nje/jquery-glob When we open the page in a browser, everything is formatted correctly according to German language conventions. A euro symbol is used for the currency symbol. The date is formatted using German day and month names. Finally, a period instead of a comma is used a number separator: You can see a running example of the above approach with the 3_GermanSite.htm file in this samples download. Example: Enabling a User to Dynamically Select a Culture In the previous example we explicitly said that we wanted to globalize in German (by referencing the jQuery.glob.de-DE.js file). Let’s now look at the first of a few examples that demonstrate how to dynamically set the globalization culture to use. Imagine that you want to display a dropdown list of all of the 350 cultures in a page. When someone selects a culture from the dropdown list, you want all of the dates in the page to be formatted using the selected culture. Here’s the HTML for the page: Notice that all of the dates are contained in a <span> tag with a data-date attribute (data-* attributes are a new feature of HTML 5 that conveniently also still work with older browsers). We’ll format the date represented by the data-date attribute when a user selects a culture from the dropdown list. In order to display dates for any possible culture, we’ll include the jQuery.glob.all.js file like this: The jQuery Globalization plugin includes a JavaScript file named jQuery.glob.all.js. This file contains globalization information for all of the more than 350 cultures supported by the Globalization plugin. At 367KB minified, this file is not small. Because of the size of this file, unless you really need to use all of these cultures at the same time, we recommend that you add the individual JavaScript files for particular cultures that you intend to support instead of the combined jQuery.glob.all.js to a page. In the next sample I’ll show how to dynamically load just the language files you need. Next, we’ll populate the dropdown list with all of the available cultures. We can use the $.cultures property to get all of the loaded cultures: Finally, we’ll write jQuery code that grabs every span element with a data-date attribute and format the date: The jQuery Globalization plugin’s parseDate() method is used to convert a string representation of a date into a JavaScript date. The plugin’s format() method is used to format the date. The “D” format specifier causes the date to be formatted using the long date format. And now the content will be globalized correctly regardless of which of the 350 languages a user visiting the page selects. You can see a running example of the above approach with the 4_SelectCulture.htm file in this samples download. Example: Loading Globalization Files Dynamically As mentioned in the previous section, you should avoid adding the jQuery.glob.all.js file to a page whenever possible because the file is so large. A better alternative is to load the globalization information that you need dynamically. For example, imagine that you have created a dropdown list that displays a list of languages: The following jQuery code executes whenever a user selects a new language from the dropdown list. The code checks whether the globalization file associated with the selected language has already been loaded. If the globalization file has not been loaded then the globalization file is loaded dynamically by taking advantage of the jQuery $.getScript() method. The globalizePage() method is called after the requested globalization file has been loaded, and contains the client-side code to perform the globalization. The advantage of this approach is that it enables you to avoid loading the entire jQuery.glob.all.js file. Instead you only need to load the files that you need and you don’t need to load the files more than once. The 5_Dynamic.htm file in this samples download demonstrates how to implement this approach. Example: Setting the User Preferred Language Automatically Many websites detect a user’s preferred language from their browser settings and automatically use it when globalizing content. A user can set a preferred language for their browser. Then, whenever the user requests a page, this language preference is included in the request in the Accept-Language header. When using Microsoft Internet Explorer, you can set your preferred language by following these steps: Select the menu option Tools, Internet Options. Select the General tab. Click the Languages button in the Appearance section. Click the Add button to add a new language to the list of languages. Move your preferred language to the top of the list. Notice that you can list multiple languages in the Language Preference dialog. All of these languages are sent in the order that you listed them in the Accept-Language header: Accept-Language: fr-FR,id-ID;q=0.7,en-US;q=0.3 Strangely, you cannot retrieve the value of the Accept-Language header from client JavaScript. Microsoft Internet Explorer and Mozilla Firefox support a bevy of language related properties exposed by the window.navigator object, such as windows.navigator.browserLanguage and window.navigator.language, but these properties represent either the language set for the operating system or the language edition of the browser. These properties don’t enable you to retrieve the language that the user set as his or her preferred language. The only reliable way to get a user’s preferred language (the value of the Accept-Language header) is to write server code. For example, the following ASP.NET page takes advantage of the server Request.UserLanguages property to assign the user’s preferred language to a client JavaScript variable named acceptLanguage (which then allows you to access the value using client-side JavaScript): In order for this code to work, the culture information associated with the value of acceptLanguage must be included in the page. For example, if someone’s preferred culture is fr-FR (French in France) then you need to include either the jQuery.glob.fr-FR.js or the jQuery.glob.all.js JavaScript file in the page or the culture information won’t be available. The “6_AcceptLanguages.aspx” sample in this samples download demonstrates how to implement this approach. If the culture information for the user’s preferred language is not included in the page then the $.preferCulture() method will fall back to using the neutral culture (for example, using jQuery.glob.fr.js instead of jQuery.glob.fr-FR.js). If the neutral culture information is not available then the $.preferCulture() method falls back to the default culture (English). Example: Using the Globalization Plugin with the jQuery UI DatePicker One of the goals of the Globalization plugin is to make it easier to build jQuery widgets that can be used with different cultures. We wanted to make sure that the jQuery Globalization plugin could work with existing jQuery UI plugins such as the DatePicker plugin. To that end, we created a patched version of the DatePicker plugin that can take advantage of the Globalization plugin when rendering a calendar. For example, the following figure illustrates what happens when you add the jQuery Globalization and the patched jQuery UI DatePicker plugin to a page and select Indonesian as the preferred culture: Notice that the headers for the days of the week are displayed using Indonesian day name abbreviations. Furthermore, the month names are displayed in Indonesian. You can download the patched version of the jQuery UI DatePicker from our github website. Or you can use the version included in this samples download and used by the 7_DatePicker.htm sample file. Summary I’m excited about our continuing participation in the jQuery community. This Globalization plugin is the third jQuery plugin that we’ve released. We’ve really appreciated all of the great feedback and design suggestions on the jQuery templating and data-linking prototypes that we released earlier this year. We also want to thank the jQuery and jQuery UI teams for working with us to create these plugins. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. You can follow me at: twitter.com/scottgu

Read the article

MySQL Syslog Audit Plugin

- by jonathonc

This post shows the construction process of the Syslog Audit plugin that was presented at MySQL Connect 2012. It is based on an environment that has the appropriate development tools enabled including gcc,g++ and cmake. It also assumes you have downloaded the MySQL source code (5.5.16 or higher) and have compiled and installed the system into the /usr/local/mysql directory ready for use. The information provided below is designed to show the different components that make up a plugin, and specifically an audit type plugin, and how it comes together to be used within the MySQL service. The MySQL Reference Manual contains information regarding the plugin API and how it can be used, so please refer there for more detailed information. The code in this post is designed to give the simplest information necessary, so handling every return code, managing race conditions etc is not part of this example code. Let's start by looking at the most basic implementation of our plugin code as seen below: /* Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved. Author: Jonathon Coombes Licence: GPL Description: An auditing plugin that logs to syslog and can adjust the loglevel via the system variables. */ #include <stdio.h> #include <string.h> #include <mysql/plugin_audit.h> #include <syslog.h> There is a commented header detailing copyright/licencing and meta-data information and then the include headers. The two important include statements for our plugin are the syslog.h plugin, which gives us the structures for syslog, and the plugin_audit.h include which has details regarding the audit specific plugin api. Note that we do not need to include the general plugin header plugin.h, as this is done within the plugin_audit.h file already. To implement our plugin within the current implementation we need to add it into our source code and compile. > cd /usr/local/src/mysql-5.5.28/plugin > mkdir audit_syslog > cd audit_syslog A simple CMakeLists.txt file is created to manage the plugin compilation: MYSQL_ADD_PLUGIN(audit_syslog audit_syslog.cc MODULE_ONLY) Run the cmake command at the top level of the source and then you can compile the plugin using the 'make' command. This results in a compiled audit_syslog.so library, but currently it is not much use to MySQL as there is no level of api defined to communicate with the MySQL service. Now we need to define the general plugin structure that enables MySQL to recognise the library as a plugin and be able to install/uninstall it and have it show up in the system. The structure is defined in the plugin.h file in the MySQL source code. /* Plugin library descriptor */ mysql_declare_plugin(audit_syslog) { MYSQL_AUDIT_PLUGIN, /* plugin type */ &audit_syslog_descriptor, /* descriptor handle */ "audit_syslog", /* plugin name */ "Author Name", /* author */ "Simple Syslog Audit", /* description */ PLUGIN_LICENSE_GPL, /* licence */ audit_syslog_init, /* init function */ audit_syslog_deinit, /* deinit function */ 0x0001, /* plugin version */ NULL, /* status variables */ NULL, /* system variables */ NULL, /* no reserves */ 0, /* no flags */ } mysql_declare_plugin_end; The general plugin descriptor above is standard for all plugin types in MySQL. The plugin type is defined along with the init/deinit functions and interface methods into the system for sharing information, and various other metadata information. The descriptors have an internally recognised version number so that plugins can be matched against the api on the running server. The other details are usually related to the type-specific methods and structures to implement the plugin. Each plugin has a type-specific descriptor as well which details how the plugin is implemented for the specific purpose of that plugin type. /* Plugin type-specific descriptor */ static struct st_mysql_audit audit_syslog_descriptor= { MYSQL_AUDIT_INTERFACE_VERSION, /* interface version */ NULL, /* release_thd function */ audit_syslog_notify, /* notify function */ { (unsigned long) MYSQL_AUDIT_GENERAL_CLASSMASK | MYSQL_AUDIT_CONNECTION_CLASSMASK } /* class mask */ }; In this particular case, the release_thd function has not been defined as it is not required. The important method for auditing is the notify function which is activated when an event occurs on the system. The notify function is designed to activate on an event and the implementation will determine how it is handled. For the audit_syslog plugin, the use of the syslog feature sends all events to the syslog for recording. The class mask allows us to determine what type of events are being seen by the notify function. There are currently two major types of event: 1. General Events: This includes general logging, errors, status and result type events. This is the main one for tracking the queries and operations on the database. 2. Connection Events: This group is based around user logins. It monitors connections and disconnections, but also if somebody changes user while connected. With most audit plugins, the principle behind the plugin is to track changes to the system over time and counters can be an important part of this process. The next step is to define and initialise the counters that are used to track the events in the service. There are 3 counters defined in total for our plugin - the # of general events, the # of connection events and the total number of events. static volatile int total_number_of_calls; /* Count MYSQL_AUDIT_GENERAL_CLASS event instances */ static volatile int number_of_calls_general; /* Count MYSQL_AUDIT_CONNECTION_CLASS event instances */ static volatile int number_of_calls_connection; The init and deinit functions for the plugin are there to be called when the plugin is activated and when it is terminated. These offer the best option to initialise the counters for our plugin: /* Initialize the plugin at server start or plugin installation. */ static int audit_syslog_init(void *arg __attribute__((unused))) { openlog("mysql_audit:",LOG_PID|LOG_PERROR|LOG_CONS,LOG_USER); total_number_of_calls= 0; number_of_calls_general= 0; number_of_calls_connection= 0; return(0); } The init function does a call to openlog to initialise the syslog functionality. The parameters are the service to log under ("mysql_audit" in this case), the syslog flags and the facility for the logging. Then each of the counters are initialised to zero and a success is returned. If the init function is not defined, it will return success by default. /* Terminate the plugin at server shutdown or plugin deinstallation. */ static int audit_syslog_deinit(void *arg __attribute__((unused))) { closelog(); return(0); } The deinit function will simply close our syslog connection and return success. Note that the syslog functionality is part of the glibc libraries and does not require any external factors. The function names are what we define in the general plugin structure, so these have to match otherwise there will be errors. The next step is to implement the event notifier function that was defined in the type specific descriptor (audit_syslog_descriptor) which is audit_syslog_notify. /* Event notifier function */ static void audit_syslog_notify(MYSQL_THD thd __attribute__((unused)), unsigned int event_class, const void *event) { total_number_of_calls++; if (event_class == MYSQL_AUDIT_GENERAL_CLASS) { const struct mysql_event_general *event_general= (const struct mysql_event_general *) event; number_of_calls_general++; syslog(audit_loglevel,"%lu: User: %s Command: %s Query: %s\n", event_general->general_thread_id, event_general->general_user, event_general->general_command, event_general->general_query ); } else if (event_class == MYSQL_AUDIT_CONNECTION_CLASS) { const struct mysql_event_connection *event_connection= (const struct mysql_event_connection *) event; number_of_calls_connection++; syslog(audit_loglevel,"%lu: User: %s@%s[%s] Event: %d Status: %d\n", event_connection->thread_id, event_connection->user, event_connection->host, event_connection->ip, event_connection->event_subclass, event_connection->status ); } } In the case of an event, the notifier function is called. The first step is to increment the total number of events that have occurred in our database.The event argument is then cast into the appropriate event structure depending on the class type, of general event or connection event. The event type counters are incremented and details are sent via the syslog() function out to the system log. There are going to be different line formats and information returned since the general events have different data compared to the connection events, even though some of the details overlap, for example, user, thread id, host etc. On compiling the code now, there should be no errors and the resulting audit_syslog.so can be loaded into the server and ready to use. Log into the server and type: mysql> INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so'; This will install the plugin and will start updating the syslog immediately. Note that the audit plugin attaches to the immediate thread and cannot be uninstalled while that thread is active. This means that you cannot run the UNISTALL command until you log into a different connection (thread) on the server. Once the plugin is loaded, the system log will show output such as the following: Oct 8 15:33:21 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: (null) Query: INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so' Oct 8 15:33:21 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: Query Query: INSTALL PLUGIN audit_syslog SONAME 'audit_syslog.so' Oct 8 15:33:40 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: (null) Query: show tables Oct 8 15:33:40 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: Query Query: show tables Oct 8 15:33:43 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: (null) Query: select * from t1 Oct 8 15:33:43 machine mysql_audit:[8337]: 87: User: root[root] @ localhost [] Command: Query Query: select * from t1 It appears that two of each event is being shown, but in actuality, these are two separate event types - the result event and the status event. This could be refined further by changing the audit_syslog_notify function to handle the different event sub-types in a different manner. So far, it seems that the logging is working with events showing up in the syslog output. The issue now is that the counters created earlier to track the number of events by type are not accessible when the plugin is being run. Instead there needs to be a way to expose the plugin specific information to the service and vice versa. This could be done via the information_schema plugin api, but for something as simple as counters, the obvious choice is the system status variables. This is done using the standard structure and the declaration: /* Plugin status variables for SHOW STATUS */ static struct st_mysql_show_var audit_syslog_status[]= { { "Audit_syslog_total_calls", (char *) &total_number_of_calls, SHOW_INT }, { "Audit_syslog_general_events", (char *) &number_of_calls_general, SHOW_INT }, { "Audit_syslog_connection_events", (char *) &number_of_calls_connection, SHOW_INT }, { 0, 0, SHOW_INT } }; The structure is simply the name that will be displaying in the mysql service, the address of the associated variables, and the data type being used for the counter. It is finished with a blank structure to show that there are no more variables. Remember that status variables may have the same name for variables from other plugin, so it is considered appropriate to add the plugin name at the start of the status variable name to avoid confusion. Looking at the status variables in the mysql client shows something like the following: mysql> show global status like "audit%"; +--------------------------------+-------+ | Variable_name | Value | +--------------------------------+-------+ | Audit_syslog_connection_events | 1 | | Audit_syslog_general_events | 2 | | Audit_syslog_total_calls | 3 | +--------------------------------+-------+ 3 rows in set (0.00 sec) The final connectivity piece for the plugin is to allow the interactive change of the logging level between the plugin and the system. This requires the ability to send changes via the mysql service through to the plugin. This is done using the system variables interface and defining a single variable to keep track of the active logging level for the facility. /* Plugin system variables for SHOW VARIABLES */ static MYSQL_SYSVAR_STR(loglevel, audit_loglevel, PLUGIN_VAR_RQCMDARG, "User can specify the log level for auditing", audit_loglevel_check, audit_loglevel_update, "LOG_NOTICE"); static struct st_mysql_sys_var* audit_syslog_sysvars[] = { MYSQL_SYSVAR(loglevel), NULL }; So now the system variable 'loglevel' is defined for the plugin and associated to the global variable 'audit_loglevel'. The check or validation function is defined to make sure that no garbage values are attempted in the update of the variable. The update function is used to save the new value to the variable. Note that the audit_syslog_sysvars structure is defined in the general plugin descriptor to associate the link between the plugin and the system and how much they interact. Next comes the implementation of the validation function and the update function for the system variable. It is worth noting that if you have a simple numeric such as integers for the variable types, the validate function is often not required as MySQL will handle the automatic check and validation of simple types. /* longest valid value */ #define MAX_LOGLEVEL_SIZE 100 /* hold the valid values */ static const char *possible_modes[]= { "LOG_ERROR", "LOG_WARNING", "LOG_NOTICE", NULL }; static int audit_loglevel_check( THD* thd, /*!< in: thread handle */ struct st_mysql_sys_var* var, /*!< in: pointer to system variable */ void* save, /*!< out: immediate result for update function */ struct st_mysql_value* value) /*!< in: incoming string */ { char buff[MAX_LOGLEVEL_SIZE]; const char *str; const char **found; int length; length= sizeof(buff); if (!(str= value->val_str(value, buff, &length))) return 1; /* We need to return a pointer to a locally allocated value in "save". Here we pick to search for the supplied value in an global array of constant strings and return a pointer to one of them. The other possiblity is to use the thd_alloc() function to allocate a thread local buffer instead of the global constants. */ for (found= possible_modes; *found; found++) { if (!strcmp(*found, str)) { *(const char**)save= *found; return 0; } } return 1; } The validation function is simply to take the value being passed in via the SET GLOBAL VARIABLE command and check if it is one of the pre-defined values allowed in our possible_values array. If it is found to be valid, then the value is assigned to the save variable ready for passing through to the update function. static void audit_loglevel_update( THD* thd, /*!< in: thread handle */ struct st_mysql_sys_var* var, /*!< in: system variable being altered */ void* var_ptr, /*!< out: pointer to dynamic variable */ const void* save) /*!< in: pointer to temporary storage */ { /* assign the new value so that the server can read it */ *(char **) var_ptr= *(char **) save; /* assign the new value to the internal variable */ audit_loglevel= *(char **) save; } Since all the validation has been done already, the update function is quite simple for this plugin. The first part is to update the system variable pointer so that the server can read the value. The second part is to update our own global plugin variable for tracking the value. Notice that the save variable is passed in as a void type to allow handling of various data types, so it must be cast to the appropriate data type when assigning it to the variables. Looking at how the latest changes affect the usage of the plugin and the interaction within the server shows: mysql> show global variables like "audit%"; +-----------------------+------------+ | Variable_name | Value | +-----------------------+------------+ | audit_syslog_loglevel | LOG_NOTICE | +-----------------------+------------+ 1 row in set (0.00 sec) mysql> set global audit_syslog_loglevel="LOG_ERROR"; Query OK, 0 rows affected (0.00 sec) mysql> show global status like "audit%"; +--------------------------------+-------+ | Variable_name | Value | +--------------------------------+-------+ | Audit_syslog_connection_events | 1 | | Audit_syslog_general_events | 11 | | Audit_syslog_total_calls | 12 | +--------------------------------+-------+ 3 rows in set (0.00 sec) mysql> show global variables like "audit%"; +-----------------------+-----------+ | Variable_name | Value | +-----------------------+-----------+ | audit_syslog_loglevel | LOG_ERROR | +-----------------------+-----------+ 1 row in set (0.00 sec) So now we have a plugin that will audit the events on the system and log the details to the system log. It allows for interaction to see the number of different events within the server details and provides a mechanism to change the logging level interactively via the standard system methods of the SET command. A more complex auditing plugin may have more detailed code, but each of the above areas is what will be involved and simply expanded on to add more functionality. With the above skeleton code, it is now possible to create your own audit plugins to implement your own auditing requirements. If, however, you are not of the coding persuasion, then you could always consider the option of the MySQL Enterprise Audit plugin that is available to purchase.

Search Results

Search found 89614 results on 3585 pages for 'code analysis'.

Page 497/3585 | < Previous Page | 493 494 495 496 497 498 499 500 501 502 503 504 | Next Page >

- by Eric Lifka

- by Rob

- by yox

- by jadook

- by egarcia

- by Igor

- by Saravanan K

- by ngm

- by bradhouse

- by George2

- by Max

- by Zack Peterson

- by Michael Mao

- by pdbartlett

- by Snick

- by Adam

- by Pat

- by D Lawson

- by user227290

- by limc

- by mwiederrecht

- by user1833044

- by Rick Strahl

- by ScottGu

- by jonathonc

< Previous Page | 493 494 495 496 497 498 499 500 501 502 503 504 | Next Page >