Search Results

Search found 62526 results on 2502 pages for 'data processing'.

Page 84/2502 | < Previous Page | 80 81 82 83 84 85 86 87 88 89 90 91 | Next Page >

Testing Reference Data Mappings

- by Michael Stephenson

Background Mapping reference data is one of the common scenarios in BizTalk development and its usually a bit of a pain when you need to manage a lot of reference data whether it be through the BizTalk Cross Referencing features or some kind of custom solution. I have seen many cases where only a couple of the mapping conditions are ever tested. Approach As usual I like to see these things tested in isolation before you start using them in your BizTalk maps so you know your mapping functions are working as expected. This approach can be used for almost all of your reference data type mapping functions where you can take advantage of MSTests data driven tests to test lots of conditions without having to write millions of tests. Walk Through Rather than go into the details of this here, I'm going to call out to one of my colleagues who wrote a nice little walk through about using data driven tests a while back. Check out Callum's blog: http://callumhibbert.blogspot.com/2009/07/data-driven-tests-with-mstest.html

Read the article
Text comparison algorithm using java-diff-utils

- by java_mouse

One of the features in our project is to implement a comparison algorithm between two versions of text and provide a % change between the two versions. While I was researching, I came across google java-diff-utils project. Has anyone used this for comparing text using java-diff-utils ? Using this utility, I can get a list of "delta" which I assume I can use it for the % of difference between two versions of the text? Is this a correct way of doing this? If you have done any text comparison algorithm using Java, could you give me some pointers?

Read the article
Calling home, receiving calls and smartphone data from the US

- by Rob Farley

I got asked about calling home from the US, by someone going to the PASS Summit. I found myself thinking “there should be a blog post about this”... The easiest way to phone home is Skype - no question. Use WiFi, and if you’re calling someone who has Skype on their phone at the other end, it’s free. Even if they don’t, it’s still pretty good price-wise. The PASS Summit conference centre has good WiFI, as do the hotels, and plenty of other places (like Starbucks). But if you’re used to having data all the time, particularly when you’re walking from one place to another, then you’ll want a sim card. This also lets you receive calls more easily, not just solving your data problem. You’ll need to make sure your phone isn’t locked to your local network – get that sorted before you leave. It’s no trouble to drop by a T-mobile or AT&T store and getting a prepaid sim. You can’t get one from the airport, but if the PASS Summit is your first stop, there’s a T-mobile store on 6th in Seattle between Pine & Pike, so you can see it from the Sheraton hotel if that’s where you’re staying. AT&T isn’t far away either. But – there’s an extra step that you should be aware of. If you talk to one of these US telcos, you’ll probably (hopefully I’m wrong, but this is how it was for me recently) be told that their prepaid sims don’t work in smartphones. And they’re right – the APN gets detected and stops the data from working. But luckily, Apple (and others) have provided information about how to change the APN, which has been used by a company based in New Zealand to let you get your phone working. Basically, you send your phone browser to http://unlockit.co.nz and follow the prompts. But do this from a WiFi place somewhere, because you won’t have data access until after you’ve sorted this out... Oh, and if you get a prepaid sim with “unlimited data”, you will still need to get a Data Feature for it. And just for the record – this is WAY easier if you’re going to the UK. I dropped into a T-mobile shop there, and bought a prepaid sim card for five quid, which gave me 250MB data and some (but not much) call credit. In Australia it’s even easier, because you can buy data-enabled sim cards that work in smartphones from the airport when you arrive. I think having access to data really helps you feel at home in a different place. It means you can pull up maps, see what your friends are doing, and more. Hopefully this post helps, but feel free to post comments with extra information if you have it. @rob_farley

Read the article
Inserting an Image into a PDF

- by Cerin

Are there Linux/Ubuntu programs capable of inserting a partially transparent image into a PDF? I'm trying to "sign" a PDF document by inserting an image of my signature, but even though every OSX and Windows PDF editor seems to support this, I haven't found any Linux PDF editors that do. I've tried PDFChain, PDF Editor, Flpsed PDF Annotator, Openoffice, Scribus, Krita, and PDFSam, and none support this. Although not technically a Linux program, I tried the site pdfescape.com, but it corrupts the images it inserts, rendering it useless for this task. Note, I'm talking about keeping the PDF in PDF format, so rasterizing it to a TIF/PNG/BMP, editing it in Gimp, and then dumping it back into a PDF isn't a solution. EDIT: I might have been premature in my criticism of pdfescape.com and PDF Editor. I was viewing the resulting PDF in Evince, which was showing a mangled image, but when I opened the PDF in PDF Editor, the image rendered correctly. I've since sent the PDF to someone on Windows who confirmed the image showed correctly. It looks like the problem might be inaccurate rendering with Evince.

Read the article
How is intermediate data organized in MapReduce?

- by Pedro Cattori

From what I understand, each mapper outputs an intermediate file. The intermediate data (data contained in each intermediate file) is then sorted by key. Then, a reducer is assigned a key by the master. The reducer reads from the intermediate file containing the key and then calls reduce using the data it has read. But in detail, how is the intermediate data organized? Can a data corresponding to a key be held in multiple intermediate files? What happens when there is too much data corresponding to one key to be held by a single file? In short, how do intermediate partitions differ from intermediate files and how are these differences dealt with in the implementation?

Read the article
DataSets and XML - The Simplistic Approach

One of the first ways I learned how to read xml data from external data sources was by using a DataSet’s ReadXML function. This function takes file path for an XML document and then converts it to a Dataset. This functionality is great when you need a simple way to process an XML document. In addition the DataSet object also offers a simple way to save data in an xml format by using the WriteXML function. This function saves the current data in the DataSet to an XML file to be used later. DataSet ds = New DataSet();String filePath = “http://www.yourdomain.com/someData.xml”;String fileSavePath = “C:\Temp\Test.xml”//Read file for this locationds.readxml(filePath);//Save file to this locationds.writexml(fileSavePath); I have used the ReadXML function before when consuming data from external Rss feeds to display on one of my sites. It allows me to quickly pull in data from external sites with little to no processing. Example site: MyCreditTech.com

Read the article
Advanced Analytics Oracle Data Mining - NEW 2-Day Training Course

- by Mike.Hallett(at)Oracle-BI&EPM

A NEW 2-Day Oracle University (OU) Instructor Led Course on Oracle Data Mining has been developed for partners and customers to learn more about data mining, predictive analytics and knowledge discovery inside the Oracle Database. Oracle Data Mining, provides data mining algorithms that run native for high performance in-database model building and model deployment. This OU course is a great way to learn the advantages and benefits of "big data analytics"; mining data, building and deploying "predictive analytics" all inside the Oracle Database and to work with OBI. To register for a class, click here, then click on View Schedule to see the latest scheduled classes and/or submit your information expressing interest in attending a class.

Read the article
Parse git log by modified files

- by MrUser

I have been told to make git messages for each modified file all one line so I can use grep to find all changes to that file. For instance: $git commit -a modified: path/to/file.cpp/.h - 1) change1 , 2) change2, etc...... $git log | grep path/to/file.cpp/.h modified: path/to/file.cpp/.h - 1) change1 , 2) change2, etc...... modified: path/to/file.cpp/.h - 1) change1 , 2) change2, etc...... modified: path/to/file.cpp/.h - 1) change1 , 2) change2, etc...... That's great, but then the actual line is harder to read because it either runs off the screen or wraps and wraps and wraps. If I want to make messages like this: $git commit -a modified: path/to/file.cpp/.h 1) change1 2) change2 etc...... is there a good way to then use grep or cut or some other tool to get a readout like $git log | grep path/to/file.cpp/.h modified: path/to/file.cpp/.h 1) change1 2) change2 etc...... modified: path/to/file.cpp/.h 1) change1 2) change2 etc...... modified: path/to/file.cpp/.h 1) change1 2) change2 etc......

Read the article
What is the best way to implement paginated text editing in Python?

- by W.F

I'm trying to build a formatted text editor in python. I need the editor to be paginated on edit mode. Same as in all popular word processors - when the user is editing the document what he/she sees is a representation of the actual, physical, page. I've tried looking into PySide but I can't find any ready solution to this, nor I can work out a way to do it myself. I am totally open to new technologies, so if you think Python is not the right choice here I would love to hear about new stuff (especially when I'm this new to UI coding). It only needs to be cross-platform and let me do rapid development (hence me looking for an out-of-the-box solution to this). Please suggest the best way to implement this. Please also note that I am looking for either a ready solution or an advice on how to tackle this. Thank you very much !

Read the article
Oracle BPM and Open Data integration development

- by drrwebber

Rapidly developing Oracle BPM application solutions with data source integration previously required significant Java and JDeveloper skills. Now using open source tools for open data development significantly reduces the coding needed. Key tasks can be performed with visual drag and drop designing combined with menu selections entry and automatic form generation directly from XSD schema definitions. The architecture used is extremely lightweight, portable, open platform and scalable allowing integration with a variety of Oracle and non-Oracle data sources and systems. Two videos available on YouTube walk through the process at both an introductory conceptual level and then a deep dive into the programming needed using JDeveloper, Oracle BPM composer and Oracle WLS (WebLogic Server) along with the CAM editor and Open-XDX open source tools. Also available are coding samples and resources from the GitHub project page, along with working online demonstration resources on the VerifyXML site. Combining Oracle BPM with these open source tools provides a comprehensive simple and elegant solution set. Development times are slashed and rapid prototyping is enabled. Also existing data sources can be integrated using open data formats with either XML or JSON along with CRUD accessing via the Open-XDX Java component. The Open-XDX tool is a code-free approach where data mapping is configured as templates using visual drag and drop in the CAM Editor open source tool. XML or JSON is then automatically generated or processed (output or input) and appropriate SQL statements created to support the data accessing. Also included is the ability to integrate with fillable PDF forms via the XML templates and the Java PDF form filling library. Again minimal Java coding is needed to associate the XML source content with the PDF named fields. The Oracle BPM forms can be automatically generated from XSD schema definitions that are built from the data mapping templates. This dramatically simplifies development work as all the integration artifacts needed are created by the open source editor toolset. The developer level video is designed as a tutorial with segments, hands-on demonstrations and reviews. This allows developers to learn the techniques and approaches used in incremental steps. The intended audience ranges from data analysts to developers and assumes only entry level Java skills and knowledge. Most actions are menu driven while Java coding is limited to simply configuring values and parameters along with performing builds and deployments from JDeveloper and Oracle WLS. Additional existing Oracle online training resources can be referenced on Oracle BPM and WLS that cover other normal delivery aspects such as user management and application deployment.

Read the article
How to reduce tight coupling between two data sources

- by fstuijt

I'm having some trouble finding a proper solution to the following architecture problem. In our setting (sketched below) we have 2 data sources, where data source A is the primary source for items of type Foo. A secondary data source exists which can be used to retrieve additional information on a Foo; however this information does not always exist. Furthermore, data source A can be used to retrieve items of type Bar. However, each Bar refers to a Foo. The difficulty here is that each Bar should refer to a Foo which, if available, also contains the information as augmented by data source B. My question is: how to remove the tight coupling between data source A and B?

Read the article
Resize images to specific height value in ImageMagick?

- by Jason

I've looked around for this, and can't find an easily implemented solution. Currently I'm working on an application that deals with panoramas. As they come out of the batch stitch process, the dimensions average 18000x4000. Using ImageMagick, how can I downscale those images to a specific height value while maintaining aspect ratio? According to the manual, the convert operation takes in both height and width to resize to while maintaining the same aspect ratio. What I'd like is to put in 600 and 1000 in my existing resize script function and have both a regular viewable image as well as a reduced size.

Read the article
How do I throttle a command in a terminal window?

- by To Do

I needed to run convert with a lot of images at the same time. The command took quite a while but this doesn't bother me. The issue is that this command rendered my computer unusable while the command was running (for about 15 minutes). So is it possible to throttle the command by limiting resources (processor and memory) to the command, directly from the command line? This can only work if I add something to the same line before pressing Enter because once I start the process the computer slows so much that it is impossible for example to switch to "System monitor" and reduce priority. Edit: top and iotop results I managed to run top and sudo iotop >iotop.txt while doing one of these convert operations. (The iotop.txt file produced is difficult to read) Results of top: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 14275 username 20 0 4043m 3.0g 1448 D 7.0 80.4 0:16.45 convert Results of iotop: [?1049h[1;24r(B[m[4l[?7h[?1h=[39;49m[?25l[39;49m(B[m[H[2JTotal DISK READ: 1269.04 K/s | Total DISK WRITE:[59G0.00 B/s (B[0;7m TID PRIO USER DISK READ DISK WRITE SWAPIN(B[0;1;7m IO(B[0;7m COMMAND [3;2H(B[m2516 be/4 username 350.08 K/s 0.00 B/s 0.00 % 0.00 % zeitgeist-datahub 7394 be/4 username 568.88 K/s 0.00 B/s 77.41 % 0.00 % --rendere~.530483991[5;1H14275 idle username 350.08 K/s 0.00 B/s 37.49 % 0.00 % convert S~f test.pdf[6;2H2048 be/4 root[6;24H0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/3:2] [5G1 be/4 root[7;24H0.00 B/s 0.00 B/s 0.00 % 0.00 % init Furthermore, even after the process ends, the computer does not return to the previous performance. I found a way around this by running sudo swapoff -a followed by sudo swapon -a

Read the article
How to make a text search template?

- by Flipper

I am not really sure what to call this, but I am looking for a way to have a "template" for my code to go by when searching for text. I am working on a project where a summary for a piece of text is supplied to the user. I want to allow the user to select a piece of text on the page so that the next time they come across a similar page I can find the text. For instance, lets say somebody goes to foxnews.com and selects the article like in the image below. Then whenever they go to any other foxnews.com article I would be able to identify the text for the article and summarize it for them. But an issue I see with this is for a site like Stack Exchange where you have multiple comments to be selected (like below) which means that I would have to be able to recursively search for all separate pieces of text. Requirements Be able to keep pieces of text separate from each other. Possible Issues DIV's may not contain ids, classes, or names. A piece of text may span across multiple DIVs How to recognize where an old piece of text ends and a new begins. How to store this information for later searching?

Read the article
Powerful Lessons in Data from the Presidential Election

- by Christina McKeon

Now that we’ve had a few days to recover from the U.S. presidential election, it’s a good time to take a step back from politics and look for the customer experience lessons that we can take away. The most powerful lesson is that when you know more about your base, you will have an advantage over your competition. That advantage will translate into you winning and your competition losing. Michael Scherer of TIME was given access to Obama’s data analysts two days before the election. His account is documented in Inside the Secret World of the Data Crunchers Who Helped Obama Win. What we learned from Scherer’s inside view is how well Obama’s team did in getting the right data, analyzing it, and acting on it. This data team recognized how critical it was to break down data silos within the campaign. As Scherer noted, they created “a single system that merged information from pollsters, fundraisers, field workers, consumer databases, and social-media and mobile contacts with the main Democratic voter files in the swing states.” The Obama analysis was so meticulous that they knew which celebrity and which type of celebrity event would help them maximize campaign contributions. With a single system, their data models became more precise. They determined which messages were more successful with specific demographic groups and that who made the calls mattered. Data analysis also led to many other changes in Obama’s campaign including a new ad buying strategy, using social media and applications to tap into supporters’ friends, and using new social news sites. While we did not have that same inside view into Romney’s campaign, much of the post-mortem coverage indicates that Romney’s team did not have the right analysis. As Peter Hamby of CNN wrote in Analysis: Why Romney Lost, “Romney officials had modeled an electorate that looked something like a mix of 2004 and 2008….” That historical data did not account for the changing demographics in the U.S. Does your organization approach data like the Obama or Romney team? Do you really know your base? How well can you predict what is going to happen in your business? If you haven’t already put together a strategy and plan to know more, this week’s civics lesson is a powerful reason to do it sooner rather than later. Your competitors are probably thinking the same thing that you are!

Read the article
No proper kmeans clustering of images in matlab

- by user3237134

I am having 1200 face images in my training set.There are 2989 test face images. I am using eigen faces (PCA) for feature extraction. I am using kmeans clustering. Source code I tried: IDX = kmeans(z,5); clustercount=accumarray(IDX, ones(size(IDX))); disp(clustercount); Problem: Images are not clustered properly. Same faces should be clustered. But different faces are being clustered. Questions: Should I have to use still more face images for training? How accuracy of clustering can be achieved? What is the solution?

Read the article
Video: Analyzing Big Data using Oracle R Enterprise

- by Sherry LaMonica

Learn how Oracle R Enterprise is used to generate new insight and new value to business, answering not only what happened, but why it happened. View this YouTube Oracle Channel video overview describing how analyzing big data using Oracle R Enterprise is different from other analytics tools at Oracle. Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, couples the wealth of analytics packages in R with the performance, scalability, and security of Oracle Database. ORE executes base R functions transparently on database data without having to pull data from Oracle Database. As an embedded component of the database, Oracle R Enterprise can run your R script and open source packages via embedded R where the database manages the data served to the R engine and user-controlled data parallelism. The result is faster and more secure access to data. ORE also works with the full suite of in-database analytics, providing integrated results to the analyst.

Read the article
Possible applications of algorithm devised for differentiating between structured vs random text

- by rooznom

I have written a program that can rapidly (within 5 sec on a 2GB RAM desktop, 2.33 Ghz CPU) differentiate between structured text (e.g english text) and random alphanumeric strings. It can also provide a probability score for the prediction. Are there any practical applications/uses of such a program. Note that the program is based on entropy models and does not have any dictionary comparisons in its workflow. Thanks in advance for your responses

Read the article
Effective and simple matching for 2 unequal small-scale point sets

- by Pavlo Dyban

I need to match two sets of 3D points, however the number of points in each set can be different. It seems that most algorithms are designed to align images and trimmed to work with hundreds of thousands of points. My case are 50 to 150 points in each of the two sets. So far I have acquainted myself with Iterative Closest Point and Procrustes Matching algorithms. Implementing Procrustes algorithms seems like a total overkill for this small quantity. ICP has many implementations, but I haven't found any readily implemented version accounting for the so-called "outliers" - points without a matching pair. Besides the implementation expense, algorithms like Fractional and Sparse ICP use some statistics information to cancel points that are considered outliers. For series with 50 to 150 points statistic measures are often biased or statistic significance criteria are not met. I know of Assignment Problem in linear optimization, but it is not suitable for cases with unequal sets of points. Are there other, small-scale algorithms that solve the problem of matching 2 point sets? I am looking for algorithm names, scientific papers or C++ implementations. I need some hints to know where to start my search.

Read the article
Applying the Knuth-Plass algorithm (or something better?) to read two books with different length and amount of chapters in parallel

- by user147133

I have a Bible reading plan that covers the whole Bible in 180 days. For the most of the time, I read 5 chapters in the Old Testament and 1 or 2 (1.5) chapters in the New Testament each day. The problem is that some chapters are longer than others (for example Psalm 119 which is 7 times longer than a average chapter in the Bible), and the plan I'm following doesn't take that in count. I end up with some days having a lot more to read than others. I thought I could use programming to make myself a better plan. I have a datastructure with a list of all chapters in the bible and their length in number of lines. (I found that the number of lines is the best criteria, but it could have been number of verses or number of words as well) I then started to think about this problem as a line wrap problem. Think of a chapter like a word, a day like a line and the whole plan as a paragraph. The "length" of a word (a chapter) is the number of lines in that chapter. I could then generate the best possible reading plan by applying a simplified Knuth-Plass algorithm to find the best breakpoints. This works well if I want to read the Bible from beginning to end. But I want to read a little from the new testament each day in parallel with the old testament. Of course I can run the Knuth-Plass algorithm on the Old Testament first, then on the New Testament and get two separate plans. But those plans merged is not a optimal plan. Worst-case days (days with extra much reading) in the New Testament plan will randomly occur on the same days as the worst-case days in the Old Testament. Since the New Testament have about 180*1.5 chapters, the plan is generally to read one chapter the first day, two the second, one the third etc... And I would like the plan for the Old Testament to compensate for this alternating length. So I will need a new and better algorithm, or I will have to use the Knuth-Plass algorithm in a way that I've not figured out. I think this could be a interesting and challenging nut for people interested in algorithms, so therefore I wanted to see if any of you have a good solution in mind.

Read the article
How can I "bulk paste" a clipboard string of multi-line text into a readable ordered list?

- by gunshor

How can I "bulk paste" a clipboard string of multi-line text into a readable ordered list? I'm trying to demonstrate how to turn any string of multi-line text into an ordered list. The script (preferably JS) needs to respect: - carriage returns at the end of a line, to mean "that line ends here" - indentations at the beginning of a line, to mean "this is part of the item above it" - dashes at the beginning of a line, to mean "this is a task, and the line above it is its project"

Read the article
Would opencv be a good choice for image colour summarization?

- by codecowboy

I would like to analyse a set of hundreds of thousands of product images (clothing, electronic goods etc) and retrieve the dominant colours in each. I'm only interested in the top 3 or 4 colours. The aim is to achieve a degree of certainty that x image is mostly red or image y is mostly orange and blue. The images are likely to be colour jpegs of reasonable quality and approximately 100kb in size. I would like to use C# and the solution should run on a Linux server, preferably using open source libraries. Would opencv be a good choice for this? What other libraries or specific algorithms might be helpful?

Read the article
What is the simplest way for a slippy SVG visualization?

- by totymedli

I have a big SVG file representing a complicated graph with hundreds of points. I want to represent this in a web page. My idea was that I could make it like Google Maps represent their maps, in those slippy, dragable, moveable maps. I'am looking for an easy and fast JavaScript library which could do the work. What I need for my "map" is the drag/move, zoom ability, and some way to click on the points of the picture, which makes a little information apear about that point, like Google maps markers. I'am looking for a free/open source library. I saw some solutions but I'am uncertain about them, and none of them seemed to be perfet: Polymaps - I love the technique it uses, but I don't know much about this library. Leaflet - I love the simplicity of it, but I dont know how could I apply it for my SVG. Raphael - I heard the awesomeness of this, but It seemed a lots of work to do this task. What would be the best/easiest solution for my problem, and what is your opinion aboute the above libraries?

Read the article
What Shading/Rendering techniques are being used in this image?

- by Rhakiras

My previous question wasn't clear enough. From a rendering point of view what kind of techniques are used in this image as I would like to apply a similar style (I'm using OpenGL if that matters): http://alexcpeterson.com/ My specific questions are: How is that sun glare made? How does the planet look "cartoon" like? How does the space around the planet look warped/misted? How does the water look that good? I'm a beginner so any information/keywords on each question would be helpful so I can go off and learn more. Thanks

Read the article
Display large amount of data to client through pagination

- by ebram tharwat

I have a web application in which i need to show a big number of data or records for clients. Now i 'll use pagination but i was wondering should I: Load all the data once then pagination, sorting and sarching 'll be easy..But it 'll takes big time(using local DB it takes up to 9 sec.) Or each time i show new page(from the pagination) i make a new request to server and then new request to DB to get the next records..But then what if the client click on Prev button, i 'll make a new request to get data that I had previously..Should i cach data that are loaded before and how if that's good technique? So load all data once or make a new request every time i need data that maybe have been loaded before. I'm using ASP.NET MVC SPA with durandaljs and knockoutjs

Read the article

< Previous Page | 80 81 82 83 84 85 86 87 88 89 90 91 | Next Page >