Search Results

Search found 58440 results on 2338 pages for 'data cleansing'.


Guessing Excel Data Types

- by AjarnMark

Note to Self

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel: TypeGuessRows = 0 means scan everything.

Note to Others

About 10 years ago I stumbled across this bit of information just when I needed it, and it saved my project. A few years later, when it would have been nice but not critical, I could not find it again anywhere. Well, now I have stumbled across it again, and to preserve my future self from nightmares and sudden baldness due to pulling my hair out, I have decided to blog it in the hopes that I can find it again this way.

Here's the story…

When you query data from an Excel spreadsheet, such as with old-fashioned DTS packages in SQL 2000 (my first reference) or simply with an OLEDB Data Adapter from ASP.NET (recent task), and you are using the Microsoft Jet 4.0 driver (newer ones may deal with this differently), you can get funny results where the query reports that a cell value is null even when you know it contains data.

What happens is that Excel doesn't really have data types. While you can format cells to appear like certain data types (e.g. Date, Time, Decimal, Text), that does not define the cell as being of a certain type in the way we think of types when working with databases. But, presumably to make things more convenient for the user (programmer), when you issue a query against Excel the query processor tries to guess what type of data each column contains and returns it in an appropriate manner. This is all well and good IF your data is consistent in every row and matches what the processor guessed. And, for efficiency's sake, the query processor works out each column's data type by analyzing only the first 8 rows of data (the default setting).

Now here's the problem. Suppose that your spreadsheet contains information about clothing, and one of the columns is Size. Suppose that in the first 8 rows all of your sizes are numbers like 32, 34, 18, 10, and so on, but somewhere after the 8th row you have sizes like S, M, L, XL. By examining only the first 8 rows, the query processor infers that the column contains numerical data, and when it hits the non-numerical data in later rows, those values come back blank. Major bummer, and a real pain to track down if you don't know that Excel is doing this, because you study the spreadsheet and say, "the data is RIGHT THERE! WHY doesn't the query see it?!?!" And the hair-pulling begins.

So, what's a developer to do? One option is to go to the registry setting noted above and change the DWORD value of TypeGuessRows from the default of 8 to 0 (zero). Setting this value to zero forces Jet to scan every row in the spreadsheet before deciding what type of data the column contains. In the example above, it would then treat the column as a string rather than as numeric, and presto! your query returns all of the values that you know are in there.

Of course, there is a caveat… if you are querying large spreadsheets, making Jet scan every row can be quite a performance hit. You could enter a different number (more than 8) that you believe gives a better sampling of rows, but it is still possible that every scanned row looks alike while later rows differ, and you get blanks where there really is data. That's the type of gamble I really don't like to take with my data. Anyone with a better approach, or with experience with more recent drivers that handle data types better, please chime in!
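To make the failure mode concrete, here is a minimal Python sketch of sample-based type guessing. It is a simplified model written for illustration, not Jet's actual algorithm: the function names are made up, and the rule "numeric only if every sampled cell is a number" is an assumption. It only shows why a guess based on the first 8 rows turns later text values into blanks, and why scanning every row avoids that.

```python
def is_number(text):
    """Return True if the cell text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def guess_column_type(cells, type_guess_rows=8):
    """Guess 'numeric' or 'text' from a sample of rows (0 = scan every row)."""
    sample = cells if type_guess_rows == 0 else cells[:type_guess_rows]
    return "numeric" if all(is_number(c) for c in sample) else "text"


def read_column(cells, type_guess_rows=8):
    """Return the column as the guessed type; mismatched cells become None."""
    if guess_column_type(cells, type_guess_rows) == "text":
        return list(cells)
    return [float(c) if is_number(c) else None for c in cells]


# Eight numeric sizes followed by letter sizes, as in the clothing example.
sizes = ["32", "34", "18", "10", "36", "38", "40", "42", "S", "M", "L", "XL"]

default_scan = read_column(sizes)   # guess from the first 8 rows only
full_scan = read_column(sizes, 0)   # guess from every row

print(default_scan[-4:])  # [None, None, None, None]
print(full_scan[-4:])     # ['S', 'M', 'L', 'XL']
```

The trade-off in the sketch is the same one described above: the full scan has to touch every cell before it can return anything.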

Using Oracle ADF Data Visualization Tools (DVT) Line Graphs to Display Weather Information

- by Christian David Straub

Overview

A guest post by Jeanne Waldman.

I have a simple JDeveloper Fusion application that retrieves weather data. I wanted to compare the week's temperatures of different locations in a graph. I decided to check out the dvt:lineGraph component, and it took me a few minutes to add it to my jspx page and supply it with data.

Drag and Drop the dvt:lineGraph onto your page

I opened my .jspx page in design mode. In the Component Palette, I selected ADF Data Visualization. Then I dragged 'Line' onto my page. A dialog popped up giving me options for the type of line graph. I chose the default. A lineGraph displayed with some default data.

Hook up your weather data

Now I wanted to hook up my own data. I browsed the tagdoc, and I found the tabularData attribute.

Attribute: tabularData
Type: java.util.List
TagDoc: Specifies a list of data that the graph uses to create a grid and populate itself. The List consists of a three-member Object array for each data value to be passed to the graph. The members of each array must be organized as follows: The first member (index 0) is the column label, in the grid, of the data value. This is generally a String. If the graph has a time axis, then this should be a Java Date. Column labels typically identify groups in the graph. The second member (index 1) is the row label, in the grid, of the data value. This is generally a String. Row labels appear as series labels in the graph (usually in the legend). The third member (index 2) is the data value, which is usually a Double.

The first member is the column label of the data value. This would be the day of the week.
The second member is the row label of the data value. This would be the location name.
The third member is the data value, usually a Double. This would be the temperature.

I already had all this information; I just needed to put it in a List with a three-member Object array for each data value.

    /**
     * This is used for the lineGraph to show the data for each location.
     */
    public List<Object[]> getTabularData()
    {
      List<Object[]> tabularData = new ArrayList<Object[]>();
      List<WeatherForecast> weatherForecastList = getWeatherForecastList();
      // loop through the list and build up the tabular data.
      for (WeatherForecast wf : weatherForecastList)
      {
        List<ForecastDay> forecastDayList = wf.getForecastDayList();
        String location = wf.getLocation();
        for (ForecastDay fday : forecastDayList)
        {
          String day = fday.getPrettyDate();
          String highTemp = fday.getHighF();
          // {column label, row label, data value}
          tabularData.add(new Object[]{day, location, Double.valueOf(highTemp)});
        }
      }
      return tabularData;
    }

Now I bound the lineGraph to this method by setting tabularData to #{weatherForAllLocationsBean.tabularData}. weatherForAllLocationsBean is my bean that is defined in faces-config.xml.

Adding a barGraph

In about 30 seconds, I added a barGraph with the same data. I dragged and dropped a bar graph onto the page and used the same tabularData as I did in the line graph.

Conclusion

I was very happy with how fast it was to hook up my weather data to these graphs. They look great, and they have built-in functionality. For instance, I can hide/show a location by clicking on the name of the location in the legend.

As a game developer, which data structure should we use to develop the game? [duplicate]

- by Rizwanabbasi

This question already has an answer here: When should vector/list be used? (5 answers)

We are developing a game about a bank robbery. Lots of people witness the robbery. Our game will load lists of suspected offenders, and the players (witnesses) will have to identify the offenders. The game should load the list of offenders as quickly as possible so that players can pick out the right one. An admin can add or remove offenders in the lists, and two or more lists of offenders can also be merged into one (to show it to the player). As game developers, which data structure should we use to develop the game? Justify your selection with solid arguments. Remember, the most critical requirement is that the list should load super fast.

Is using MultiMaps code smell? If so what alternative data structures fit my needs?

- by Pureferret

I'm trying to model nWoD characters for a roleplaying game in a character builder program. The crux is that I want to support saving to and loading from yaml documents. One aspect of the characters is their set of skills. Skills are split between exactly three 'types': Mental, Physical, and Social. Each type has a list of skills under it. My Yaml looks like this:

    PHYSICAL:
      Athletics: 0
      Brawl: 3
    MENTAL:
      Academics: 2
      Computers

My initial thought was to use a Multimap of some sort, with the skill type as an Enum that keys my map. Each Skill is an element in the collection that backs the multimap. However, I've been struggling to get the yaml to work. When I explained this to a colleague outside of work, he said this was probably a sign of code smell, and that he's never seen it used 'well'. Are MultiMaps really a code smell? If so, what alternative data structures would suit my goals?

How to find data usage of a user on my website?

- by Dharmik

I have a website (project) where users log in, do their work, and then log out. I need to build a report that displays how much data each person has used (bandwidth, how much was downloaded in KB, etc.). So the process might be to count usage from user login to user logout. I have seen a little about Webalizer and AWStats for something like this, but I am not sure how they work. I have tried Content-Length, but some pages don't send Content-Length. I have also seen mod_bandwidth, but I am still a little confused. This is needed for my site because our company is now thinking of charging per usage and also allocating bandwidth to each user (according to their membership). I haven't worked with this type of tool and I am a newbie in this matter. I have only done simple websites, with no settings like this in Apache or Linux. My project is in CodeIgniter.

Configuration data: single-row table vs. name-value-pair table

- by Heinzi

Let's say you write an application that can be configured by the user. For storing this "configuration data" into a database, two patterns are commonly used.

The single-row table

    CompanyName    | StartFullScreen   | RefreshSeconds   | ...
    ---------------+-------------------+------------------+--------
    ACME Inc.      | true              | 20               | ...

The name-value-pair table

    ConfigOption     | Value
    -----------------+------------------------
    CompanyName      | ACME Inc.
    StartFullScreen  | true (or 1, or Y, ...)
    RefreshSeconds   | 20
    ...              | ...

I've seen both options in the wild, and both have obvious advantages and disadvantages, for example:

The single-row table limits the number of configuration options you can have (since the number of columns in a row is usually limited). Every additional configuration option requires a DB schema change.
In a name-value-pair table everything is "stringly typed" (you have to encode/decode your Boolean/Date/etc. parameters).
(many more)

Is there some consensus within the development community about which option is preferable?
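The typing difference between the two patterns can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names config_row and config_kv are invented, an in-memory SQLite database stands in for whatever database the application really uses, and the Boolean is stored as 0/1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Pattern 1: single-row table, one typed column per option.
conn.execute(
    "CREATE TABLE config_row ("
    "CompanyName TEXT, StartFullScreen INTEGER, RefreshSeconds INTEGER)"
)
conn.execute("INSERT INTO config_row VALUES ('ACME Inc.', 1, 20)")

# Pattern 2: name-value-pair table, every value stored as text.
conn.execute("CREATE TABLE config_kv (ConfigOption TEXT PRIMARY KEY, Value TEXT)")
conn.executemany(
    "INSERT INTO config_kv VALUES (?, ?)",
    [
        ("CompanyName", "ACME Inc."),
        ("StartFullScreen", "true"),
        ("RefreshSeconds", "20"),
    ],
)

# Single row: the database hands back an integer directly.
refresh_row = conn.execute("SELECT RefreshSeconds FROM config_row").fetchone()[0]

# Name-value pairs: the application has to decode the string itself.
raw = conn.execute(
    "SELECT Value FROM config_kv WHERE ConfigOption = ?", ("RefreshSeconds",)
).fetchone()[0]
refresh_kv = int(raw)

print(refresh_row, repr(raw), refresh_kv)  # 20 '20' 20
```

Adding a new option to config_kv is a single INSERT, while adding one to config_row needs an ALTER TABLE, which is the schema-change cost mentioned above.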

Data Import Resources for Release 17

- by Pete

With Release 17 you now have three ways to import data into CRM On Demand: the Import Assistant; Oracle Data Loader On Demand, a new, Java-based, command-line utility with a programmable API; and Web Services.

We have created the Data Import Options Overview document to help you choose the method that works best for you. We have also created the Data Import Resources page as a single point of reference. It guides you to all resources related to these three import options. So if you're looking for the Data Import Options Overview document, the Data Loader Overview for Release 17, the Data Loader User Guide, or the Data Loader FAQ, here's where you find them: on our new Training and Support Center, under the Learn More tab, go to the What's New section and click Data Import Resources.

Rails Easy Data Dumping

- by Madhan ayyasamy

Hi Friends,

The following snippets show the easiest way to dump data in a Ruby on Rails environment. You'll often need to get data from production to dev, or dev to your local, or your local to another developer's local. One plug-in we use over and over is Yaml_db. This nifty little plug-in enables you to dump or load data by issuing a Rake command. The data is persisted in a yaml file located in db/data.yml. This is very portable and easy to read if you need to examine the data.

    rake db:data:dump

Example data found in db/data.yml:

    ---
    campaigns:
      columns:
      - id
      - client_id
      - name
      - created_at
      - updated_at
      - token
      records:
      - - "1"
        - "1"
        - First push
        - 2008-11-03 18:23:53
        - 2008-11-03 18:23:53
        - 3f2523f6a665
      - - "2"
        - "2"
        - First push
        - 2008-11-03 18:26:57
        - 2008-11-03 18:26:57
        - 9ee8bc427d94

Transmitting Form Data from the Client to the Web Server

The steps involved in transmitting form data from the client to the web server:

1. The user loads the web form.
2. The user enters data into the web form fields.
3. The user clicks submit.
4. On submit, the page validates the fields using JavaScript. If validation errors are found, the validation script stops the browser from posting the data to the web server and displays error messages as needed.
5. If the form passes validation, the browser URL-encodes the value of every field and posts it to the server.
6. The server reads the posted data from the request (the body for a POST, the query string for a GET) and validates it again, both to ensure data consistency and to prevent non-validated data (for example, because JavaScript was turned off in the client's browser) from being inserted into a database or passed on to another process.
7. If the data passes the second validation check, the server-side code continues with the requested processing.
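The server-side half of this flow (decode the URL-encoded values, then validate again) can be sketched with Python's standard urllib.parse module. The field names, the validation rules, and the functions validate and handle_post are all hypothetical, chosen only to show the shape of the second validation pass.

```python
from urllib.parse import parse_qs, urlencode


def validate(fields):
    """Return a dict of field name -> error message (hypothetical rules)."""
    errors = {}
    if not fields.get("name", "").strip():
        errors["name"] = "Name is required."
    if not fields.get("age", "").isdigit():
        errors["age"] = "Age must be a whole number."
    return errors


def handle_post(body):
    """Decode a URL-encoded request body and validate it again on the server."""
    parsed = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list per field; keep the first value of each.
    fields = {key: values[0] for key, values in parsed.items()}
    errors = validate(fields)
    return ("rejected", errors) if errors else ("accepted", fields)


# What a browser would send for a valid and an invalid submission.
good_body = urlencode({"name": "Ada Lovelace", "age": "36"})
bad_body = urlencode({"name": "", "age": "thirty"})

print(good_body)               # name=Ada+Lovelace&age=36
print(handle_post(good_body))  # ('accepted', {'name': 'Ada Lovelace', 'age': '36'})
print(handle_post(bad_body)[0])  # rejected
```

Because the server repeats the checks, a client that skips the JavaScript validation still cannot get the invalid submission accepted.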

Which data structure would you use for a witness list?

- by mateen

I'm making a game where the plot is a bank robbery. Lots of people witness that robbery. The game will load a list of suspects, while the players (witnesses) will have to identify the suspects of this robbery. The game should load a list of suspects to identify the one as quickly as possible. Admin can add/remove suspects in the lists and two or more lists of suspects can also be merged into one (to show it to the player). The question is which data structure will be suitable to develop the lists?

SQL Peer-to-Peer Dynamic Structured Data Processing Collaboration

Unstructured and XML semi-structured data is now used more than structured data. But fixed structured data still keeps businesses running day in and day out, which requires consistent predictable highly principled processing for correct results. For this reason, it would be very useful to have a general purpose SQL peer-to-peer collaboration capability that can utilize highly principled hierarchical data processing and its flexible and advanced structured processing to support dynamically structured data and its dynamic structured processing. This flexible dynamic structured processing can change the structure of the data as necessary for the required processing while preserving the relational and hierarchical data principles. This processing will perform freely across remote unrelated peer locations anytime and transparently process unpredictable and unknown structured data and data type changes automatically for immediate processing using automatic metadata maintenance.

Is there a data structure for this type of list/map?

- by Nick

Perhaps there's a name for what I want, but I'm not aware of it. I need something similar to a LinkedHashMap in Java, but where it returns the 'previous' value if there's no value at the specified key. That is, I have a list of objects stored by an integer key (which is in units of time in my case):

    ; key->value
    10->A
    15->B
    20->C

So, if I were to query for a value for key 0-9, it would return null. The special part is that if I queried for something with 10 <= i <= 14, it would return A. Or, for i = 20, it would return C. Is there a data structure for this?
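The behaviour described here is usually called a "floor" lookup: return the value stored at the greatest key that is less than or equal to the query. In Java, the standard NavigableMap interface (implemented by TreeMap) offers floorEntry for this. As a rough sketch of the same idea in Python, the hypothetical FloorMap class below keeps the keys sorted and uses the standard bisect module; it assumes the entries are known up front and does not handle later insertions.

```python
from bisect import bisect_right


class FloorMap:
    """Read-only map that answers with the value at the greatest key <= query."""

    def __init__(self, items):
        pairs = sorted(items)
        self._keys = [key for key, _ in pairs]
        self._values = [value for _, value in pairs]

    def get(self, key):
        # bisect_right gives the number of stored keys <= key.
        index = bisect_right(self._keys, key) - 1
        return self._values[index] if index >= 0 else None


timeline = FloorMap([(10, "A"), (15, "B"), (20, "C")])

print(timeline.get(5))   # None
print(timeline.get(12))  # A
print(timeline.get(20))  # C
```

Each lookup is a binary search, so it costs O(log n) comparisons for n stored keys.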

On a failing hard drive, I am able to view data but unable to copy it - why?

- by Tom

I have a 2.5" external hard drive that is failing. It's not making the 'clicking' noise that most failing hard drives make, and I am able to view the data, but I am unable to actually retrieve it. I attempted to use SpinRite to access the data on the drive, but it didn't like the external drive.

When I view the drive's property page, it shows that its used space is at 100% and that it has 0 bytes available; however, the progress indicator under the drive icon in Windows Explorer shows that it's roughly 50% full (which is correct). When I run Windows' "Error Checking" tool and choose "scan for and attempt recovery of bad sectors," the tool begins to run and then immediately closes with no error message.

I am able to browse the contents of the drive using Windows Explorer. When I try copying any single file, the copy process begins, an indicator starts, and then the copy fails with no real error message. The Disk Management page in Computer Management under Control Panel also shows this drive as being 'Healthy.'

I dropped the drive off at a data recovery store and they said that "The data seems to be intact, but an internal failure is preventing any information from being retrieved." They offered to refer me to a data recovery specialist.

I've also attempted to run CHKDSK on the drive (with and without arguments), but it returns the following error: The type of the filesystem is RAW. CHKDSK is not available for RAW drives.

Before going the route of more expensive data recovery, I'm wondering if these symptoms sound familiar to anyone?

Other questions... I'm willing to continue trying tools such as TestDisk and/or PhotoRec (as the majority of the data that I'd like to salvage is photos), but how long should I expect either tool to run given approximately 400GB of data? I'm also comfortable using Linux, so I welcome any suggestions for utilities, tools, and strategies with which you've had success.

Is there any chance that my data will get silently corrupted with a robocopy SMB network transfer?

- by Archagon

I'm setting up a NAS box for the first time. At the moment, I have most of my data backed up to a few local hard drives, and I intend to transfer all the data to my NAS over ethernet once the RAID array is setup. Since this is all happening over the network, I'm a bit worried about my data getting corrupted silently during transfer. From what I understand, data generally doesn't get corrupted without notice on local transfers because a checksum is performed at some point by the drive or the OS. (This could be totally wrong.) Does the same thing happen with SMB, or is it up to the transferrer to check the integrity of their data? And if it doesn't happen with SMB, is there a protocol that does ensure data integrity? I know that rsync can checksum a transfer, but I'm on Windows and I already have a robocopy configuration that I like. Will my data be safe or do I have to use an external checksum tool to make sure?
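Whatever the answer for SMB turns out to be, an external check can be as simple as hashing the source and the copy and comparing the digests. The sketch below uses Python's standard hashlib; the function names are invented, and the temporary files only stand in for a local file and its copy on the NAS. It does not say anything about what robocopy or SMB verify on their own.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source, destination):
    """Return True when both files have the same SHA-256 digest."""
    return sha256_of(source) == sha256_of(destination)


# Demonstration with temporary files standing in for the real paths.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "source.bin")
copy = os.path.join(workdir, "copy.bin")

with open(source, "wb") as handle:
    handle.write(b"backup data" * 1000)
shutil.copyfile(source, copy)
copy_ok = verify_copy(source, copy)

with open(copy, "ab") as handle:
    handle.write(b"!")  # simulate a corrupted copy
corrupted_ok = verify_copy(source, copy)

shutil.rmtree(workdir)
print(copy_ok, corrupted_ok)  # True False
```

One caveat with this kind of check: reading the copy back immediately may be served from a cache rather than from the disks, so a later second pass is the more convincing test.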

Freezes (not crashes) with GCD, blocks and Core Data

- by Lukasz

I have recently rewritten my Core Data driven database controller to use Grand Central Dispatch to manage fetching and importing in the background. The controller can operate on 2 NSManagedObjectContexts:

NSManagedObjectContext *mainMoc, an instance variable for the main thread. This context is used only for quick access for the UI, by the main thread or by the dispatch_get_main_queue() global queue.

NSManagedObjectContext *bgMoc, for background tasks (importing, and fetching data for NSFetchedResultsController for tables). These background tasks are fired ONLY by a user-defined queue: dispatch_queue_t bgQueue (an instance variable in the database controller object).

Fetching data for tables is done in the background so as not to block the UI when bigger or more complicated predicates are evaluated. Example fetching code for NSFetchedResultsController in my table view controllers:

    -(void)fetchData{
        dispatch_async([CDdb db].bgQueue, ^{
            NSError *error = nil;
            [[self.fetchedResultsController fetchRequest] setPredicate:self.predicate];
            if (self.fetchedResultsController && ![self.fetchedResultsController performFetch:&error]) {
                NSSLog(@"Unresolved error in fetchData %@", error);
            }
            if (!initial_fetch_attampted)initial_fetch_attampted = YES;
            fetching = NO;
            dispatch_async(dispatch_get_main_queue(), ^{
                [self.table reloadData];
                [self.table scrollRectToVisible:CGRectMake(0, 0, 100, 20) animated:YES];
            });
        });
    } // end of fetchData function

bgMoc merges with mainMoc on save using NSManagedObjectContextDidSaveNotification:

    - (void)bgMocDidSave:(NSNotification *)saveNotification {
        // CDdb - bgMoc didsave - merging changes with main mainMoc
        dispatch_async(dispatch_get_main_queue(), ^{
            [self.mainMoc mergeChangesFromContextDidSaveNotification:saveNotification];
            // Extra notification for some other, potentially interested clients
            [[NSNotificationCenter defaultCenter] postNotificationName:DATABASE_SAVED_WITH_CHANGES object:saveNotification];
        });
    }

    - (void)mainMocDidSave:(NSNotification *)saveNotification {
        // CDdb - main mainMoc didSave - merging changes with bgMoc
        dispatch_async(self.bgQueue, ^{
            [self.bgMoc mergeChangesFromContextDidSaveNotification:saveNotification];
        });
    }

The NSFetchedResultsController delegate has only one method implemented (for simplicity):

    - (void)controllerDidChangeContent:(NSFetchedResultsController *)controller {
        dispatch_async(dispatch_get_main_queue(), ^{
            [self fetchData];
        });
    }

This way I am trying to follow Apple's recommendation for Core Data: 1 NSManagedObjectContext per thread. I know this pattern is not completely clean for at least 2 reasons:

bgQueue does not necessarily run on the same thread after suspension, but since it is serial, it should not matter much (there are never 2 threads trying to access the bgMoc NSManagedObjectContext dedicated to it).

Sometimes table view data source methods will ask NSFetchedResultsController for info from bgMoc (since the fetch is done on bgQueue), like the sections count, the count of fetched objects in a section, etc.

Even with these flaws, this approach works pretty well for 95% of the application's running time, until...

AND HERE GOES MY QUESTION: Sometimes, very randomly, the application freezes but does not crash. It does not respond to any touch, and the only way to bring it back to life is to restart it completely (switching to the background and back does not help). No exception is thrown and nothing is printed to the console (I have breakpoints set for all exceptions in Xcode). I have tried to debug it using Instruments (the Time Profiler especially) to see if there is something heavy going on on the main thread, but nothing shows up. I am aware that GCD and Core Data are the main suspects here, but I have no idea how to track or debug this.

Let me point out that this also happens when I dispatch all the tasks to the queues asynchronously only (using dispatch_async everywhere). This makes me think it is not just a standard deadlock. Are there any hints on how I could get more info about what is going on? Some extra debug flags, Instruments magic tricks, build settings, etc.? Any suggestions on what could be the cause are very much appreciated, as are pointers on how to implement background fetching for NSFetchedResultsController and background importing in a better way.

Tile Engine - Procedural generation, Data structures, Rendering methods - A lot of effort question!

- by Trixmix

Isometric Tile and GameObject rendering. To achieve the look I want, I need to take into consideration which tiles are drawn first and which last. What I used is a TileRenderQueue object: you give it a list of tiles and it gives you back a queue of which ones to draw, ordered by their Z coordinate, so that a tile with a higher Z is drawn later. Now, if you read above, you would know that instead of storing the location data in the tile instance, I want the index in the array to be the location. Then maybe I could draw the tiles based on the array, instead of spending a long time looping over them and ordering them by Z. This is the hardest part for me. It's hard for me to find a simple solution to the problem of which tile to draw when. There is also the fact that a game object with a larger X needs to be drawn over the tiles with a smaller X, and so on.

All the parts work together to create an efficient engine, so it's important to me that you answer all of the parts. I hope you will work on the answers just as hard as I worked on this question! If any part is unclear, tell me so in the comments! Thanks for the help!

NEW 2-Day Instructor Led Course on Oracle Data Mining Now Available!

- by chberger

A NEW 2-Day Instructor Led Course on Oracle Data Mining has been developed for customers and anyone wanting to learn more about data mining, predictive analytics and knowledge discovery inside the Oracle Database.

Course Objectives:

Explain basic data mining concepts and describe the benefits of predictive analysis
Understand primary data mining tasks, and describe the key steps of a data mining process
Use the Oracle Data Miner to build, evaluate, and apply multiple data mining models
Use Oracle Data Mining's predictions and insights to address many kinds of business problems, including: Predict individual behavior, Predict values, Find co-occurring events
Learn how to deploy data mining results for real-time access by end-users

Five reasons why you should attend this 2 day Oracle Data Mining Oracle University course. With Oracle Data Mining, a component of the Oracle Advanced Analytics Option, you will learn to gain insight and foresight to:

Go beyond simple BI and dashboards about the past. This course will teach you about "data mining" and "predictive analytics", analytical techniques that can provide huge competitive advantage
Take advantage of your data and investment in Oracle technology
Leverage all the data in your data warehouse, customer data, service data, sales data, customer comments and other unstructured data, point of sale (POS) data, to build and deploy predictive models throughout the enterprise.
Learn how to explore and understand your data and find patterns and relationships that were previously hidden
Focus on solving strategic challenges to the business, for example, targeting "best customers" with the right offer, identifying product bundles, detecting anomalies and potential fraud, finding natural customer segments and gaining customer insight.

How do I express subtle relationships in my data?

- by Chuck H

"A" is related to "B" and "C". How do I show that "B" and "C" might, by this context, be related as well?

Example: Here are a few headlines about a recent Broadway play:

1 - David Mamet's Glengarry Glen Ross, Starring Al Pacino, Opens on Broadway
2 - Al Pacino in 'Glengarry Glen Ross': What did the critics think?
3 - Al Pacino earns lackluster reviews for Broadway turn
4 - Theater Review: Glengarry Glen Ross Is Selling Its Stars Hard
5 - Glengarry Glen Ross; Hey, Who Killed the Klieg Lights?

Problem: Running a fuzzy-string match over these records will establish some relationships, but not others, even though a human reader could pick them out from context in much larger datasets. How do I find the relationship that suggests #3 is related to #4? Both of them can be easily connected to #1, but not to each other. Is there a (Googlable) name for this kind of data or structure? What kind of algorithm am I looking for?

Goal: Given 1,000 headlines, a system that automatically suggests that these 5 items are all probably about the same thing.

To be honest, it's been so long since I've programmed that I'm at a loss how to properly articulate this problem. (I don't know what I don't know, if that makes sense.) This is a personal project and I'm writing it in Python. Thanks in advance for any help, advice, and pointers!

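One way to read the question is as a graph problem: treat each headline as a node, link two headlines when they look similar, and then take the connected components, so that #3 and #4 end up together because both link to #1. Searchable names for this include "connected components", "transitive closure" and "single-linkage clustering". The Python sketch below is only an illustration under assumptions: the similarity rule (at least two shared non-stopword tokens), the stopword list, and the sixth, unrelated headline are all invented for the example.

```python
import re
from itertools import combinations

STOPWORDS = {"the", "is", "its", "in", "on", "for", "did", "what", "who", "hey", "s"}


def tokens(headline):
    """Lowercase word set of a headline, minus a few stopwords."""
    return {w for w in re.findall(r"[a-z]+", headline.lower()) if w not in STOPWORDS}


def cluster(headlines, min_shared=2):
    """Group headline indices into connected components of the similarity graph."""
    parent = list(range(len(headlines)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    token_sets = [tokens(h) for h in headlines]
    for a, b in combinations(range(len(headlines)), 2):
        if len(token_sets[a] & token_sets[b]) >= min_shared:
            parent[find(a)] = find(b)  # link the two groups

    groups = {}
    for i in range(len(headlines)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


headlines = [
    "David Mamet's Glengarry Glen Ross, Starring Al Pacino, Opens on Broadway",
    "Al Pacino in 'Glengarry Glen Ross': What did the critics think?",
    "Al Pacino earns lackluster reviews for Broadway turn",
    "Theater Review: Glengarry Glen Ross Is Selling Its Stars Hard",
    "Glengarry Glen Ross; Hey, Who Killed the Klieg Lights?",
    "Local bakery wins regional bread award",  # made-up unrelated headline
]

shared_3_and_4 = tokens(headlines[2]) & tokens(headlines[3])
print(shared_3_and_4)      # set(): #3 and #4 share no tokens directly
print(cluster(headlines))  # [[0, 1, 2, 3, 4], [5]]
```

The weakness of this approach is that one spurious link merges two unrelated groups, so on 1,000 headlines the similarity rule would need more care than a raw token count.
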
Hidden Gems: Accelerating Oracle Data Integrator with SOA, Groovy, SDK, and XML

- by Alex Kotopoulis

On the last day of Oracle OpenWorld, we had a final advanced session on getting the most out of Oracle Data Integrator through the use of various advanced techniques.

The primary way to improve your ODI processes is to choose the optimal knowledge modules for your load and take advantage of the optimized tools of your database, such as Oracle Data Pump and similar mechanisms in other databases. Knowledge modules also allow you to customize tasks, so that you can codify best practices that are consistently applied by all integration developers.

The ODI SDK is another very powerful means to automate and speed up your integration development process. It allows you to automate Life Cycle Management, code comparison, repetitive code generation and changes to your integration projects. The SDK is easily accessible through Java or scripting languages such as Groovy and Jython.

Finally, all Oracle Data Integration products provide services that can be integrated into a larger Service Oriented Architecture. This moves data integration from an isolated environment into an agile part of a larger business process environment. All Oracle data integration products can play a part in this:

Oracle GoldenGate can integrate into business event streams by processing JMS queues or publishing new events based on database transactions.
Oracle Data Integrator allows full control of its runtime sessions through web services, so that integration jobs can become part of business processes.
Oracle Data Service Integrator provides a data virtualization layer over your distributed sources, allowing unified reading and updating for heterogeneous data without replicating and moving data.
Oracle Enterprise Data Quality provides data quality services to cleanse and deduplicate your records through web services.

What's the proper term for a function inverse to a constructor - to unwrap a value from a data type?

- by Petr Pudlák

Edit: I'm rephrasing the question a bit. Apparently I caused some confusion because I didn't realize that the term destructor is used in OOP for something quite different: it's a function invoked when an object is being destroyed. In functional programming we (try to) avoid mutable state, so there is no equivalent to it. (I added the proper tag to the question.) Instead, I've seen that the record field for unwrapping a value (especially for single-valued data types such as newtypes) is sometimes called destructor or perhaps deconstructor. For example, let's have (in Haskell):

    newtype Wrap = Wrap { unwrap :: Int }

Here Wrap is the constructor and unwrap is what? The questions are:

What do we call unwrap in functional programming? Deconstructor? Destructor? Or some other term?
And to clarify, is this/other terminology applicable to other functional languages, or is it used just in Haskell?
Perhaps also, is there any terminology for this in general, in non-functional languages?

I've seen both terms, for example:

... Most often, one supplies smart constructors and destructors for these to ease working with them. ...

at Haskell wiki, or

... The general theme here is to fuse constructor - deconstructor pairs like ...

at Haskell wikibook (here it's probably meant in a bit more general sense), or

    newtype DList a = DL { unDL :: [a] -> [a] }

The unDL function is our deconstructor, which removes the DL constructor. ...

in Real World Haskell.

Read the article
How to optimize Core Data query for full text search

- by dk

Can I optimize a Core Data query when searching for matching words in a text? (This question also pertains to the wisdom of custom SQL versus Core Data on an iPhone.) I'm working on a new (iPhone) app that is a handheld reference tool for a scientific database. The main interface is a standard searchable table view and I want as-you-type response as the user types new words. Word matches must be prefixes of words in the text. The text is composed of 100,000s of words. In my prototype I coded SQL directly. I created a separate "words" table containing every word in the text fields of the main entity. I indexed words and performed searches along the lines of SELECT id, * FROM textTable JOIN (SELECT DISTINCT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) ON id=textTableId LIMIT 50 This runs very fast. Using an IN would probably work just as well, i.e. SELECT * FROM textTable WHERE id IN (SELECT textTableId FROM words WHERE word BETWEEN 'foo' AND 'fooz' ) LIMIT 50 The LIMIT is crucial and allows me to display results quickly. I notify the user that there are too many to display if the limit is reached. This is kludgy. I've spent the last several days pondering the advantages of moving to Core Data, but I worry about the lack of control over the schema, indexing, and querying for an important query. Theoretically an NSPredicate of textField MATCHES '.*\bfoo.*' would just work, but I'm sure it will be slow. This sort of text search seems so common that I wonder what the usual approach is. Would you create a words entity as I did above and use a predicate of "word BEGINSWITH 'foo'"? Will that work as fast as my prototype? Will Core Data automatically create the right indexes? I can't find any explicit means of advising the persistent store about indexes. I see some nice advantages of Core Data in my iPhone app. The faulting and other memory considerations allow for efficient database retrievals for tableview queries without setting arbitrary limits. 
The object graph management allows me to easily traverse entities without writing lots of SQL. Migration features will be nice in the future. On the other hand, in a limited resource environment (iPhone) I worry that an automatically generated database will be bloated with metadata, unnecessary inverse relationships, inefficient attribute datatypes, etc. Should I dive in or proceed with caution?
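The prototype query described above can be tried out with a small, self-contained sketch. It uses Python's standard `sqlite3` module rather than the iPhone code from the question, and the sample rows, the `words_word` index name, and the helper names (`prefix_upper_bound`, `build_demo_db`, `search`) are all assumptions made for illustration. It also swaps `BETWEEN 'foo' AND 'fooz'` for a half-open range (`>= 'foo' AND < 'fop'`), since a word such as "foozle" sorts after "fooz" and would be missed by the original bounds.

```python
import sqlite3


def prefix_upper_bound(prefix: str) -> str:
    """Return the smallest string that sorts after every string starting with prefix.

    Assumes a non-empty prefix whose last character is not the maximum code point.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database shaped like the prototype (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE textTable (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE words (word TEXT, textTableId INTEGER);
        CREATE INDEX words_word ON words (word);
        """
    )
    rows = {1: "food web", 2: "foozle bird", 3: "bar foo", 4: "fop"}
    for row_id, body in rows.items():
        conn.execute("INSERT INTO textTable VALUES (?, ?)", (row_id, body))
        # One row per word, pointing back at the text it came from.
        conn.executemany(
            "INSERT INTO words VALUES (?, ?)",
            [(word, row_id) for word in body.split()],
        )
    return conn


def search(conn: sqlite3.Connection, prefix: str, limit: int = 50) -> list:
    """Return ids of texts containing a word that starts with prefix, capped at limit."""
    upper = prefix_upper_bound(prefix)
    cursor = conn.execute(
        """
        SELECT id FROM textTable
        WHERE id IN (SELECT textTableId FROM words
                     WHERE word >= ? AND word < ?)
        ORDER BY id
        LIMIT ?
        """,
        (prefix, upper, limit),
    )
    return [row[0] for row in cursor]
```

Calling `search(build_demo_db(), "foo")` matches the texts containing "food", "foozle" and "foo", but not the one containing only "fop". Whether Core Data's `BEGINSWITH` predicate turns into a comparable indexed range scan is exactly the open question in the post; this sketch only illustrates the hand-written SQL side.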

Read the article
Dynamic data-entry value store

- by simendsjo

I'm creating a data-entry application where users are allowed to create the entry schema. My first version of this just created a single table per entry schema, with each entry spanning a single column or multiple columns (for complex types) with the appropriate data type. This allowed for "fast" querying (on small datasets, as I didn't index all columns) and simple synchronization where the data entry was distributed across several databases. I'm not quite happy with this solution though; the only positive thing is the simplicity... I can only store a fixed number of columns. I need to create indexes on all columns. I need to recreate the table on schema changes. Some of my key design criteria are: very fast querying (using a simple domain-specific query language); writes don't have to be fast; many concurrent users; schemas will change often; schemas might contain many thousands of columns; the data entries might be distributed and need synchronization; preferably MySQL and SQLite (databases like DB2 and Oracle are out of the question); using .Net/Mono. I've been thinking of a couple of possible designs, but none of them seems like a good choice. Solution 1: A union-like table containing a Type column and one nullable column per type. This avoids joins, but will definitely use a lot of space. Solution 2: Key/value store. All values are stored as strings and converted when needed. This also uses a lot of space, and of course, I hate having to convert everything to a string. Solution 3: Use an XML database or store values as XML. Without any experience I would think this is quite slow (at least for the relational model, unless there is some very good XPath support). I would also like to avoid an XML database, as other parts of the application fit better in a relational model, and being able to join the data is helpful. I cannot help thinking that someone has solved (some of) this already, but I'm unable to find anything. Not quite sure what to search for either... 
I know market research is doing something like this for their questionnaires, but there are few open source implementations, and the ones I've found don't quite fit the bill. PSPP has much of the logic I'm thinking of: primitive column types, many columns, many rows, fast querying and merging. Too bad it doesn't work against a database. And of course... I don't need 99% of the provided functionality, but I do need a lot of stuff that is not included. I'm not sure this is the right place to ask such a design-related question, but I hope someone here has some tips, knows of any existing work, or can point me to a better place to ask such a question. Thanks in advance!
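To make "Solution 1" from the question concrete, here is a minimal sketch of a union-like table with a type column and one nullable column per type. It is written in Python against SQLite (one of the two preferred databases) purely for illustration, not in the .Net/Mono stack the post mentions. The table name `entry_value`, its column names, and the helpers `open_store`, `put`, `get` and `entries_where_int_greater` are all hypothetical.

```python
import sqlite3

# Maps a Python type to (type tag, column that holds values of that type).
_COLUMN_FOR_TYPE = {
    int: ("int", "int_value"),
    float: ("real", "real_value"),
    str: ("text", "text_value"),
}


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with one row per (entry, field) pair."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entry_value (
            entry_id   INTEGER NOT NULL,
            field      TEXT    NOT NULL,
            type       TEXT    NOT NULL,
            int_value  INTEGER,
            real_value REAL,
            text_value TEXT,
            PRIMARY KEY (entry_id, field)
        );
        CREATE INDEX entry_value_int ON entry_value (field, int_value);
        """
    )
    return conn


def put(conn, entry_id, field, value):
    """Store value in the column matching its type; other value columns stay NULL."""
    type_name, column = _COLUMN_FOR_TYPE[type(value)]
    # The column name comes from the fixed mapping above, never from user input.
    conn.execute(
        f"INSERT OR REPLACE INTO entry_value (entry_id, field, type, {column}) "
        "VALUES (?, ?, ?, ?)",
        (entry_id, field, type_name, value),
    )


def get(conn, entry_id, field):
    """Return the stored value in its original type, or None if absent."""
    row = conn.execute(
        "SELECT type, int_value, real_value, text_value FROM entry_value "
        "WHERE entry_id = ? AND field = ?",
        (entry_id, field),
    ).fetchone()
    if row is None:
        return None
    type_name, int_value, real_value, text_value = row
    return {"int": int_value, "real": real_value, "text": text_value}[type_name]


def entries_where_int_greater(conn, field, threshold):
    """Typed query without string conversion: entries whose int field exceeds threshold."""
    cursor = conn.execute(
        "SELECT entry_id FROM entry_value "
        "WHERE field = ? AND type = 'int' AND int_value > ? ORDER BY entry_id",
        (field, threshold),
    )
    return [row[0] for row in cursor]
```

In this layout, adding a field to a schema is just a new `field` value, so no table has to be recreated. The cost is the one the question already names: every row carries unused nullable columns, and a query spanning several fields needs one self-join or subquery per field.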

Read the article
Excel tables creation upon MySQL data import (new feature in MySQL for Excel 1.2.x)

- by Javier Treviño

In this blog post we are going to talk about one of the features included since MySQL for Excel 1.2.0. You can install the latest GA or maintenance version using the MySQL Installer, or optionally you can directly download any GA or non-GA version from the MySQL Developer Zone. Remember how easy it is to dump data from a MySQL table, view or stored procedure to an Excel worksheet? (If you don't, you can check out this other post: How To - Guide to Importing Data from a MySQL Database to Excel using MySQL for Excel). In version 1.2.0 we introduced some advanced options for the Import MySQL Data operation regarding Excel tables. The Advanced Options dialog shown above is accessible from any Import Data dialog. When the Create an Excel table for the imported MySQL table data option is checked (which it is by default), MySQL for Excel will create an Excel table (also known in Excel jargon as a ListObject) from the Excel range containing the imported MySQL data. This "little feature" lets you use the Excel table in data analysis right away, for example by including it for summarization in a PivotTable, adding a summarization row at the end of the table's data, or sorting or filtering the table's data by clicking the drop-down button next to each column's header, among other actions. The Excel tables that are created automatically from imported MySQL data will have a name like [UserPrefix].<SchemaName>.<DbObjectName> for tables and views, and <Prefix>.<SchemaName>.<ProcedureName>.<ResultSetName> for stored procedures. Notice that the first piece of the name is an optional [UserPrefix]. The prefix is only used if the Prefix Excel tables with the following text option is checked. The suggested prefix is "MySQL", but it can be changed to whatever text is suitable for you. Excel tables must have a table style so they are easily identified. 
There are a lot of predefined Excel table styles. By default the MySqlDefault style is applied, which is the style you have seen applied to imported data for Edit Sessions, and which adds simple and elegant formatting to the table. If you wish to change it to any of the predefined Excel table styles, you can do so through the drop-down list in the Use style [[styles drop-down]] for the new Excel table option. Excel tables are the basic building blocks for data analysis or self-service Business Intelligence using other, more advanced Excel tools like Power Pivot, Power View or Power Map. This feature lets you use imported MySQL data in more advanced ways. We hope you give this and the other new features in the 1.2.x version family a try! Remember that your feedback is very important to us, so drop us a message and follow us: MySQL on Windows (this) Blog: https://blogs.oracle.com/MySqlOnWindows/ MySQL for Excel forum: http://forums.mysql.com/list.php?172 Facebook: http://www.facebook.com/mysql YouTube channel: https://www.youtube.com/user/MySQLChannel Cheers!

Read the article
How to recover data files from xampp-windows to xampp-linux after crash?

- by David Buehler

My Windows box died after I developed a database in xampp on it; fortunately I have a backup of the entire F:/TestWeb/Xampp partition. Unfortunately, I did not do an Export (nor dump) of the "Lws2" database before the crash. I have replaced the defunct machine with one running Mint7 (based on Ubuntu 9.04 "Jaunty Jackalope") and installed xampp-linux into the /opt partition, so the new xampp now runs fine in /opt/lampp, and says all the elements are secured by passwords (which I just assigned during this installation.) I assumed that xampp-windows installed in November would migrate easily to xampp-linux installed in February -- a bad assumption. It apparently would have been simple if I had known enough to do an Export or a Dump before the crash, but.... The backup was done to a Network Attached Storage drive, which is formatted as "vfat" so the backup does not carry with it any valid ownership permissions from MySql on NTFS. I now see from my backup that the old data resided in \TestWeb\Xampp\Mysql\Data\Lws2\ and consists of 7 ".frm" files which define my tables. The actual data -- I suppose a ".sql" file or files -- has disappeared, and I am resigning myself to two days of retyping it. But I do not wish to do the table layouts all over again. So I copied the Data tree to /opt/lampp/Data -- PhpMyAdmin does not see it. So I copied the Lws2 tree to /opt/lampp/Lws2 -- PhpMyAdmin does not see it. So I copied the Data tree to /opt/lampp/var/mysql/Data -- PhpMyAdmin does not see it. So I copied the Lws2 tree to /opt/lampp/var/mysql/Lws2 -- PhpMyAdmin does not see it. So I changed the owner from "nobody" to "root" and gave full permissions to all groups and to all others, with permissions percolating down, in all 4 trees. You guessed it -- PhpMyAdmin does not see any database named Lws2, only its 4 default ones. I double-checked the permissions and rebooted Linux and repeated the tests. 
At some point in that process I did see PhpMyAdmin showing "lws2(7)" but when I clicked on it I saw a "no table found" message. I have not been able to recreate that experience. Apparently there are some setup files for MySql and for PhpMyAdmin which need to be set up by running a wizard or two or by editing the files directly. I grepped the TestWeb tree and found an old "ldir = "C:TestWeb\Xampp\MySql\" and a "DataDir = C:TestWeb\Xampp\MySql\" in a .php file and in a .bat file, but I cannot find the corresponding config file names on the /opt partition -- so it looks as if these wizards have not been run to create them. What config files does Linux use to set up MySql config files for PhpMyAdmin? What wizards do I need to run to point the MySql engine and PhpMyAdmin at the folder /opt/lampp/data/ with its lws2 folder inside it? Or which files do I need to edit, with a sample of what they normally say under Linux? Incidentally, I remember I converted from MyISAM with its .MYD and .MYI files to InnoDB after entering only a small amount of the data -- and I do not know what file types to look for -- perhaps my data is still there but under another guise or in another place? Is it something as simple as Linux needing to see "/data/" instead of /Data? I will check that out while waiting for a response. If anyone can point me to documentation that discusses this level of detail -- I will read it avidly! In any case, thanks for any clarification you can give on this thorny problem. wizdum

Read the article
Storing game objects with generic object information

- by Mick

In a simple game object class, you might have something like this: public abstract class GameObject { protected String name; // other properties protected double x, y; public GameObject(String name, double x, double y) { // etc } // setters, getters } I was thinking, since a lot of game objects (ex. generic monsters) will share the same name, movement speed, attack power, etc, it would be better to have all that information shared between all monsters of the same type. So I decided to have an abstract class "ObjectData" to hold all this shared information. So whenever I create a generic monster, I would use the same pre-created "ObjectData" for it. Now the above class becomes more like this: public abstract class GameObject { protected ObjectData data; protected double x, y; public GameObject(ObjectData data, double x, double y) { // etc } // setters, getters public String getName() { return data.getName(); } } So to tailor this specifically for a Monster (could be done in a very similar way for Npcs, etc), I would add 2 classes. Monster which extends GameObject, and MonsterData which extends ObjectData. Now I'll have something like this: public class Monster extends GameObject { public Monster(MonsterData data, double x, double y) { super(data, x, y); } } This is where my design question comes in. Since MonsterData would hold data specific to a generic monster (and would vary with what say NpcData holds), what would be the best way to access this extra information in a system like this? At the moment, since the data variable is of type ObjectData, I'll have to cast data to MonsterData whenever I use it inside the Monster class. 
One solution I thought of is this, but this might be bad practice: public class Monster extends GameObject { private MonsterData data; // <- this part here public Monster(MonsterData data, double x, double y) { super(data, x, y); this.data = data; // <- this part here } } For one, I've read that I should generally avoid shadowing the underlying class's variables. What do you guys think of this solution? Is it bad practice? Do you have any better solutions? Is the design in general bad? How should I redesign this if it is? Thanks in advance for any replies, and sorry about the long question. Hopefully it all makes sense!

Read the article
