Search Results

Search found 356 results on 15 pages for 'datasets'.

Page 5/15 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Google I/O 2011: GIS with Google Earth and Google Maps

Google I/O 2011: GIS with Google Earth and Google Maps Josh Livni, Mano Marks Building a robust interactive map with a lot of data involves more than just adding a few placemarks. We'll talk about integrating with existing GIS software, importing data from shapefiles and other formats, map projections, and techniques for managing, analyzing, and rendering large datasets. From: GoogleDevelopers Views: 3785 19 ratings Time: 52:25 More in Science & Technology

Read the article
Views from Abroad: XML Pipelines and Delta XML

A U.K.-based company uses XML to replicate the advantages of a pipeline in handling complex datasets. It is a simple tool, useful for such tasks as Java regression testing and version control, but the few tricks it does, it does well, according to our columnist.

Read the article
Justification for learning/implementing newer Microsoft technologies

- by Darren

I work at a large healthcare organization as a mid-level software developer. I have over 10 years experience in the IT industry using Microsoft technologies (ASP.NET & SQL Server). When I go to conferences, code camps, .net user group meetings, I hear of all kinds of new tools and technologies: MVC, LINQ, Entity Framework, WCF Web Services, etc. I guess you could say I'm in my comfort zone using the same old stuff from asp.net 2.0. I use typed datasets for my data access layer. I use web forms and feature rich server controls with master pages. I know how to use plain old SQL and create queries in my typed datasets to get at data my applications need. Throughout my career, I'm always sensitive to not become obsolete with my skill set. What I currently use works fine and my development time is fast. But I'm concerned that if I were to be laid off, I would be asked in interviews how many MVC apps I've written. Or how I am with LINQ or WCF web services. I know that it doesn't matter how many conferences, books, or videos I watch on some new technology...I have to implement/use it or it simply won't sink in. Also, managers who interview don't care how much someone reads up on something, only real use and experience with a technology. I have a new project to write. I've gone to my manager and have asked for additional time for the project for learning/implementing technology I may not be familiar with. Our organization encourages its employees to "learn and grow" and to continue are education. But I always get resistance when I ask for more time to ramp up on something new to implement. My manager is asking for concrete business reasons for implementing these new technologies. I don't have business reasons. My reasons are because I don't want to become obsolete. I could say it would make the project more maintainable in the future by other developers since at some point people could stop using these older technologies, but that' about all I can think of. Does Linq/Entity Framework/MCV apps perform better? So much so that the customers (users in departments I'm creating this app for) need? I doubt it. I'm interested in you guy's thoughts on this. Do many of you have similar plights with trying to use newer upcoming technologies? I doubt I'm on the bleeding edge of technology, either. Are there "business reasons" that you would bring to light for using these technologies? Thanks in advance! Sorry for the long wall of text.

Read the article
When to skip solving the general problem and settling for the specific problem?

- by Peter Smith

I've been working hard on trying to develop a general solution to my problem, but I cannot seem to formulate a proper algorithm for it, at least one that doesn't take a ton of inaccurate grunt work building a lookup table. I have a solution already for the specific requirement, but it requires the software's configuration to be changed every time the software is loaded with a different geographic area's datasets. So is it better to be finished and move on for now, or to keep attempting to solve the general problem knowing that the specific problems will keep popping up?

Read the article
New ZFS Encryption features in Solaris 11.1

- by darrenm

Solaris 11.1 brings a few small but significant improvements to ZFS dataset encryption. There is a new readonly property 'keychangedate' that shows that date and time of the last wrapping key change (basically the last time 'zfs key -c' was run on the dataset), this is similar to the 'rekeydate' property that shows the last time we added a new data encryption key. $ zfs get creation,keychangedate,rekeydate rpool/export/home/bob NAME PROPERTY VALUE SOURCE rpool/export/home/bob creation Mon Mar 21 11:05 2011 - rpool/export/home/bob keychangedate Fri Oct 26 11:50 2012 local rpool/export/home/bob rekeydate Tue Oct 30 9:53 2012 local The above example shows that we have changed both the wrapping key and added new data encryption keys since the filesystem was initially created. If we haven't changed a wrapping key then it will be the same as the creation date. It should be obvious but for filesystems that were created prior to Solaris 11.1 we don't have this data so it will be displayed as '-' instead. Another change that I made was to relax the restriction that the size of the wrapping key needed to match the size of the data encryption key (ie the size given in the encryption property). In Solaris 11 Express and Solaris 11 if you set encryption=aes-256-ccm we required that the wrapping key be 256 bits in length. This restriction was unnecessary and made it impossible to select encryption property values with key lengths 128 and 192 when the wrapping key was stored in the Oracle Key Manager. This is because currently the Oracle Key Manager stores AES 256 bit keys only. Now with Solaris 11.1 this restriciton has been removed. There is still one case were the wrapping key size and data encryption key size will always match and that is where they keysource property sets the format to be 'passphrase', since this is a key generated internally to libzfs and to preseve compatibility on upgrade from older releases the code will always generate a wrapping key (using PKCS#5 PBKDF2 as before) that matches the key length size of the encryption property. The pam_zfs_key module has been updated so that it allows you to specify encryption=off. There were also some bugs fixed including not attempting to load keys for datasets that are delegated to zones and some other fixes to error paths to ensure that we could support Zones On Shared Storage where all the datasets in the ZFS pool were encrypted that I discussed in my previous blog entry. If there are features you would like to see for ZFS encryption please let me know (direct email or comments on this blog are fine, or if you have a support contract having your support rep log an enhancement request).

Read the article
Google I/O 2012 - Knowledge-Based Application Design Patterns

Google I/O 2012 - Knowledge-Based Application Design Patterns Shawn Simister In this talk we'll look at emerging design patterns for building web applications that take advantage of large-scale, structured data. We'll look at open datasets like Wikipedia and Freebase as well as structured markup like Schema.org and RDFa to see what new types of applications these technologies open up for developers. For all I/O 2012 sessions, go to developers.google.com From: GoogleDevelopers Views: 1 0 ratings Time: 56:55 More in Science & Technology

Read the article
BIDS Helper 1.6 Beta Release (now with SQL 2012 support!)

- by Darren Gosbell

The beta for BIDS Helper 1.6 was just released. We have not updated the version notification just yet as we would like to get some feedback on people's experiences with the SQL 2012 version. So if you are using SQL 2012, go grab it and let us know how you go (you can post a comment on this blog post or on the BIDS Helper site itself). This is the first release that supports SQL 2012 and consequently also the first release that runs in Visual Studio 2010. A big thanks to Greg Galloway for doing the bulk of the work on this release. Please note that if you are doing an xcopy deploy that you will need to unblock the files you download or you will get a cryptic error message. This appears to be caused by a security update to either Visual Studio or the .Net framework – the xcopy deploy instructions have been updated to show you how to do this. Below are the notes from the release page. ====== This beta release is the first to support SQL Server 2012 (in addition to SQL Server 2005, 2008, and 2008 R2). Since it is marked as a beta release, we are looking for bug reports in the next few months as you use BIDS Helper on real projects. In addition to getting all existing BIDS Helper functionality working appropriately in SQL Server 2012 (SSDT), the following features are new... Analysis Services Tabular Smart Diff Tabular Actions Editor Tabular HideMemberIf Tabular Pre-Build Fixes and Updates The Unused Datasets feature for Reporting Services now accounts for new features in Reporting Services 2008 R2 like Lookups and new features in Reporting Services 2012. SSIS: emit an informational message when a variable has an expression defined and EvaluateAsExpression = False SSAS: roles reports points to wrong server SSIS - Variable Copy / Move broken in v1.5 "Unused DataSets Report" not showing up in Context menu on VS2005 if Solution Folders used SSAS Tabular: Create a UI for managing actions SSAS Tabular: Smart Diff improvements for new schema and Tabular models SSIS: Copy/Move Variable Erroring due to custom Control Flow item Icon SSIS Performance Visualization Index out of range fixing bugs in AggManager when aggregation design IDs don't match names The exe downloads are a self extracting installer, the zip downloads allow for an xcopy deploy. Make sure to note the updated xcopy deploy instructions for SQL Server 2012.

Read the article
Visualizing Data with the Google Maps API: A Journey of 245k Points

Visualizing Data with the Google Maps API: A Journey of 245k Points What can you do with some awesome geospatial data, the Google Maps API, and a couple of days of hacking and analysis? Brendan and Paul walk through how they used the Maps API to visualize the CLIWOC database, and pass on tips and trick for doing the same with other geospatial datasets. CLIWOC (Climatological Database for the World's Oceans, 1750-1850): www.ucm.es From: GoogleDevelopers Views: 0 0 ratings Time: 00:00 More in Education

Read the article
Google I/O 2010 - Batch data processing with App Engine

Google I/O 2010 - Batch data processing with App Engine Google I/O 2010 - Batch data processing with App Engine App Engine 201 Mike Aizatsky In this session, attendees will learn how to write map() functions, how to do simple reduce() operations, how to run these over large datasets, and how App Engine is used to accomplish such parallelism. For all I/O 2010 sessions, please go to code.google.com From: GoogleDevelopers Views: 6 0 ratings Time: 38:45 More in Science & Technology

Read the article
Stairway to MDX - Level 2: The Ordinal Function

Business Intelligence Architect Bill Pearson introduces the MDX Ordinal Function, as a means for generating lists and for conditionally presenting calculations. He also demonstrates the use of the function in creating datasets to support report parameter picklists. Develop seamlessly between Management Studio and Visual StudioSQL Connect is a Visual Studio add-in that makes it easy to keep your database and Visual Studio project in sync.

Read the article
How to mount a LOFS in Solaris that doesn’t cross mountpoints

- by jcea

I need to access my "root" ZFS dataset to delete a file under "/var". But "/var" is overlayed by another ZFS dataset. Since these are system datasets I can't "umount" them while the machine is running. And I want to avoid to reboot the system in "failsafe" mode, since this is a production machine. Teorically ZFS would refuse to mount "/var" dataset over the underlying "/var", because it is not empty. But it works, possibly because they are system datasets mounted early in the boot process. But having the underlying "/var" not empty is preventing me to create an ABE (Alternate Boot Environment), so patching is risky, and I can't upgrade my system using Live Upgrade. The machine is remote. I have an IP KVM, but I rather prefer to avoid booting this machine in "failsafe" mode, if I can. I know there is a file in "/var/" because I can snapshot the "root" dataset and check it. But snapshots are read-only, so I can't get rid of the file. I tried "mkdir /tmp/zzz; mount -F lofs / /tmp/zzz", but when I go to "/tmp/zzz/var", I see the "/var" dataset, not the underlying "root" dataset. That is, the LOFS is crossing mountpoints. I would usually like it, but not this time!. Any suggestion, beside rebooting the machine in "failsafe" and mess with it thru the IP KVM?

Read the article
Can MySQL use multiple data directories on different physical storage devices

- by sirlark

I am running MySQL with its data dir on a 128Gb SSD. I am dealing with large datasets (~20Gb) that are loaded and processed weekly, each stored in a separate DB for the purposes of time point comparisons. Putting all the data into a single database in unfeasible because the performance on such large databases is already a problem. However, I cannot keep more than 6 datasets on the SSD at a time. Right now I am manually dumping the oldest to much larger 2Tb spinning disk every week, and dropping the database to make space for the new one. But if I need one of the 'archived' databases (a semi regular occurrence) I have to drop a current one (after dumping), reload it, do what I need to, then reverse the results. Is there a way to configure MySQL to use multiple data directories, say one on the SSD and one on the 2Tb spinning disk, and 'merge' them transparently? If I could do this, then archiving would no longer mean "moved out of the database entirely", but instead would mean "moved onto the slow physical device". The time taken to do my queries on a spinning disk would be less than that taken to completely dump, drop, load, drop, reload two entire databases, so this is a win. I thought of using something like unionfs but I can't think of a way to control which database gets stored on which physical drive, because it works by merging on a directory level (from what I understand) so I'm still stuck with using multiple directories. Any help appreciated, thanks in advance

Read the article
To sample or not to sample...

- by [email protected]

Ideally, we would know the exact answer to every question. How many people support presidential candidate A vs. B? How many people suffer from H1N1 in a given state? Does this batch of manufactured widgets have any defective parts? Knowing exact answers is expensive in terms of time and money and, in most cases, is impractical if not impossible. Consider asking every person in a region for their candidate preference, testing every person with flu symptoms for H1N1 (assuming every person reported when they had flu symptoms), or destructively testing widgets to determine if they are "good" (leaving no product to sell). Knowing exact answers, fortunately, isn't necessary or even useful in many situations. Understanding the direction of a trend or statistically significant results may be sufficient to answer the underlying question: who is likely to win the election, have we likely reached a critical threshold for flu, or is this batch of widgets good enough to ship? Statistics help us to answer these questions with a certain degree of confidence. This focuses on how we collect data. In data mining, we focus on the use of data, that is data that has already been collected. In some cases, we may have all the data (all purchases made by all customers), in others the data may have been collected using sampling (voters, their demographics and candidate choice). Building data mining models on all of your data can be expensive in terms of time and hardware resources. Consider a company with 40 million customers. Do we need to mine all 40 million customers to get useful data mining models? The quality of models built on all data may be no better than models built on a relatively small sample. Determining how much is a reasonable amount of data involves experimentation. When starting the model building process on large datasets, it is often more efficient to begin with a small sample, perhaps 1000 - 10,000 cases (records) depending on the algorithm, source data, and hardware. This allows you to see quickly what issues might arise with choice of algorithm, algorithm settings, data quality, and need for further data preparation. Instead of waiting for a model on a large dataset to build only to find that the results don't meet expectations, once you are satisfied with the results on the initial sample, you can take a larger sample to see if model quality improves, and to get a sense of how the algorithm scales to the particular dataset. If model accuracy or quality continues to improve, consider increasing the sample size. Sampling in data mining is also used to produce a held-aside or test dataset for assessing classification and regression model accuracy. Here, we reserve some of the build data (data that includes known target values) to be used for an honest estimate of model error using data the model has not seen before. This sampling transformation is often called a split because the build data is split into two randomly selected sets, often with 60% of the records being used for model building and 40% for testing. Sampling must be performed with care, as it can adversely affect model quality and usability. Even a truly random sample doesn't guarantee that all values are represented in a given attribute. This is particularly troublesome when the attribute with omitted values is the target. A predictive model that has not seen any examples for a particular target value can never predict that target value! For other attributes, values may consist of a single value (a constant attribute) or all unique values (an identifier attribute), each of which may be excluded during mining. Values from categorical predictor attributes that didn't appear in the training data are not used when testing or scoring datasets. In subsequent posts, we'll talk about three sampling techniques using Oracle Database: simple random sampling without replacement, stratified sampling, and simple random sampling with replacement.

Read the article
Solaris 11.1: Encrypted Immutable Zones on (ZFS) Shared Storage

- by darrenm

Solaris 11 brought both ZFS encryption and the Immutable Zones feature and I've talked about the combination in the past. Solaris 11.1 adds a fully supported method of storing zones in their own ZFS using shared storage so lets update things a little and put all three parts together. When using an iSCSI (or other supported shared storage target) for a Zone we can either let the Zones framework setup the ZFS pool or we can do it manually before hand and tell the Zones framework to use the one we made earlier. To enable encryption we have to take the second path so that we can setup the pool with encryption before we start to install the zones on it. We start by configuring the zone and specifying an rootzpool resource: # zonecfg -z eizoss Use 'create' to begin configuring a new zone. zonecfg:eizoss> create create: Using system default template 'SYSdefault' zonecfg:eizoss> set zonepath=/zones/eizoss zonecfg:eizoss> set file-mac-profile=fixed-configuration zonecfg:eizoss> add rootzpool zonecfg:eizoss:rootzpool> add storage \ iscsi://zs7120-tvp540-c.uk.oracle.com/luname.naa.600144f09acaacd20000508e64a70001 zonecfg:eizoss:rootzpool> end zonecfg:eizoss> verify zonecfg:eizoss> commit zonecfg:eizoss> Now lets create the pool and specify encryption: # suriadm map \ iscsi://zs7120-tvp540-c.uk.oracle.com/luname.naa.600144f09acaacd20000508e64a70001 PROPERTY VALUE mapped-dev /dev/dsk/c10t600144F09ACAACD20000508E64A70001d0 # echo "zfscrypto" > /zones/p # zpool create -O encryption=on -O keysource=passphrase,file:///zones/p eizoss \ /dev/dsk/c10t600144F09ACAACD20000508E64A70001d0 # zpool export eizoss Note that the keysource example above is just for this example, realistically you should probably use an Oracle Key Manager or some other better keystorage, but that isn't the purpose of this example. Note however that it does need to be one of file:// https:// pkcs11: and not prompt for the key location. Also note that we exported the newly created pool. The name we used here doesn't actually mater because it will get set properly on import anyway. So lets go ahead and do our install: zoneadm -z eizoss install -x force-zpool-import Configured zone storage resource(s) from: iscsi://zs7120-tvp540-c.uk.oracle.com/luname.naa.600144f09acaacd20000508e64a70001 Imported zone zpool: eizoss_rpool Progress being logged to /var/log/zones/zoneadm.20121029T115231Z.eizoss.install Image: Preparing at /zones/eizoss/root. AI Manifest: /tmp/manifest.xml.ujaq54 SC Profile: /usr/share/auto_install/sc_profiles/enable_sci.xml Zonename: eizoss Installation: Starting ... Creating IPS image Startup linked: 1/1 done Installing packages from: solaris origin: http://pkg.us.oracle.com/solaris/release/ Please review the licenses for the following packages post-install: consolidation/osnet/osnet-incorporation (automatically accepted, not displayed) Package licenses may be viewed using the command: pkg info --license <pkg_fmri> DOWNLOAD PKGS FILES XFER (MB) SPEED Completed 187/187 33575/33575 227.0/227.0 384k/s PHASE ITEMS Installing new actions 47449/47449 Updating package state database Done Updating image state Done Creating fast lookup database Done Installation: Succeeded Note: Man pages can be obtained by installing pkg:/system/manual done. Done: Installation completed in 929.606 seconds. Next Steps: Boot the zone, then log into the zone console (zlogin -C) to complete the configuration process. Log saved in non-global zone as /zones/eizoss/root/var/log/zones/zoneadm.20121029T115231Z.eizoss.install That was really all we had to do, when the install is done boot it up as normal. The zone administrator has no direct access to the ZFS wrapping keys used for the encrypted pool zone is stored on. Due to how inheritance works in ZFS he can still create new encrypted datasets that use those wrapping keys (without them ever being inside a process in the zone) or he can create encrypted datasets inside the zone that use keys of his own choosing, the output below shows the two cases: rpool is inheriting the key material from the global zone (note we can see the value of the keysource property but we don't use it inside the zone nor does that path need to be (or is) accessible inside the zone). Whereas rpool/export/home/bob has set keysource locally. # zfs get encryption,keysource rpool rpool/export/home/bob NAME PROPERTY VALUE SOURCE rpool encryption on inherited from $globalzone rpool keysource passphrase,file:///zones/p inherited from $globalzone rpool/export/home/bob encryption on local rpool/export/home/bob keysource passphrase,prompt local

Read the article
extracting RDL data using LINQ

- by BobC

I'm working with some SQL Report definition files (RDLs), using LINQ to extract component query statements for validation. I'm trying to extract the <DataSet> elements from under the <DataSets> element. I seem to be getting hung up with one of the elements under <DataSet><Fields><Field> which has a namespace qualifier <rd:TypeName> I've been using LINQ to XML for other parts of the files where there is no namespace qualifiers with no trouble, by specifying a default namespace. The RDL specifies two namespaces: xmlns="http://schemas.microsoft.com/sqlserver/reporting/2005/01/reportdefinition" xmlns:rd="http://schemas.microsoft.com/SQLServer/reporting/reportdesigner"> When I try to get the <DataSets> element, however, I get the following error: System.Xml.XmlException - The ':' character, hexadecimal value 0x3A, cannot be included in a name. I know it has to do with the namespace qualifier (rd:) in one of the child elements, but I'm having difficulty getting a LINQ expression that works. Any help would be appreciated. Thanks!

Read the article
In Reporting Services how to filter drop down parameter list in based on other selected parameter?

- by Lee Englestone

Question In a Reporting Services Report, How do I filter a second drop down list of cars to only show cars whose ManufacturerId is equal the selected Manufacturer (from the first drop down list)? Report Datasets I have 2 datasets. Dataset 1. A list of Manufacturers. From a stored procedure Report_Manufacturers_P Dataset 2. A list of Cars, including a column called Manufacturers id. From a stored procedure Report_Cars_P Report Parameters On the Report I have 2 Parameters. Parameter 1. ManufacturerId. Set from A drop down list of Manufacturers (DataSet 1). Parameter 2. CarId. Set from A drop down list of Cars (DataSet 2). I've tried.. Creating another sproc called Report_Manufacturer_Cars_P that takes the ManufacturerId as an integer and returns a list of cars made by that manufacturer. Any Ideas. As selecting a Manufacturer doesn't seem to want to kick off anything that filters the Car list? Thanks in advance, -- Lee

Read the article
How do I pass a lot of parameters to views in Django?

- by Mark

I'm very new to Django and I'm trying to build an application to present my data in tables and charts. Till now my learning process went very smooth, but now I'm a bit stuck. My pageview retrieves large amounts of data from a database and puts it in the context. The template then generates different html-tables. So far so good. Now I want to add different charts to the template. I manage to do this by defining <img src=".../> tags. The Matplotlib chart is generate in my chartview an returned via: response=HttpResponse(content_type='image/png') canvas.print_png(response) return response Now I have different questions: the data is retrieved twice from the database. Once in the pageview to render the tables, and again in the chartview for making the charts. What is the best way to pass the data, already in the context of the page to the chartview? I need a lot of charts, each with different datasets. I could make a chartview for each chart, but probably there is a better way. How do I pass the different dataset names to the chartview? Some charts have 20 datasets, so I don't think that passing these dataset parameters via the url (like: <imgm src="chart/dataset1/dataset2/.../dataset20/chart.png />) is the right way. Any advice?

Read the article
php cli script hangs with no messages

- by julio

Hi-- I've written a PHP script that runs via SSH and nohup, meant to process records from a database and do stuff with them (eg. process some images, update some rows). It works fine with small loads, up to maybe 10k records. I have some larger datasets that process around 40k records (not a lot, I realize, but it adds up to a lot of work when each record requires the download and processing of up to 50 images). The larger datasets can take days to process. Sometimes I'll see in my debug logs memory errors, which are clear enough-- but sometimes the script just appears to "die" or go zombie on me. My tail of the debug log just stops, with no error messages, the tail of the nohup log ends with no error, and the process is still showing in a ps list, looking like this-- 26075 pts/0 S 745:01 /usr/bin/php ./import.php but no work is getting done. Can anyone give me some ideas on why a process would just quit? The obvious things (like a php script timeout and memory issues) are not a factor, as far as I can tell. Thanks for any tips PS-- this is hosted on a godaddy VDS (not my choice). I am sort of suspecting that godaddy has some kind of limits that might kick in on me despite what overrides I put in the code (such as set_time_limit(0);).

Read the article
Apply a recursive CTE on grouped table rows (SQL server 2005).

- by Evan V.

Hi all, I have a table (ROOMUSAGE) containing the times people check in and out of rooms grouped by PERSONKEY and ROOMKEY. It looks like this: PERSONKEY | ROOMKEY | CHECKIN | CHECKOUT | ROW ---------------------------------------------------------------- 1 | 8 | 13-4-2010 10:00 | 13-4-2010 11:00 | 1 1 | 8 | 13-4-2010 08:00 | 13-4-2010 09:00 | 2 1 | 1 | 13-4-2010 15:00 | 13-4-2010 16:00 | 1 1 | 1 | 13-4-2010 14:00 | 13-4-2010 15:00 | 2 1 | 1 | 13-4-2010 13:00 | 13-4-2010 14:00 | 3 13 | 2 | 13-4-2010 15:00 | 13-4-2010 16:00 | 1 13 | 2 | 13-4-2010 15:00 | 13-4-2010 16:00 | 2 I want to select just the consecutive rows for each PERSONKEY, ROOMKEY grouping. So the desired resulting table is: PERSONKEY | ROOMKEY | CHECKIN | CHECKOUT | ROW ---------------------------------------------------------------- 1 | 8 | 13-4-2010 10:00 | 13-4-2010 11:00 | 1 1 | 1 | 13-4-2010 15:00 | 13-4-2010 16:00 | 1 1 | 1 | 13-4-2010 14:00 | 13-4-2010 15:00 | 2 1 | 1 | 13-4-2010 13:00 | 13-4-2010 14:00 | 3 13 | 2 | 13-4-2010 15:00 | 13-4-2010 16:00 | 1 I want to avoid using cursors so I thought I would use a recursive CTE. Here is what I came up with: ;with CTE (PERSONKEY, ROOMKEY, CHECKIN, CHECKOUT, ROW) as (select RU.PERSONKEY, RU.ROOMKEY, RU.CHECKIN, RU.CHECKOUT, RU.ROW from ROOMUSAGE RU where RU.ROW = 1 union all select RU.PERSONKEY, RU.ROOMKEY, RU.CHECKIN, RU.CHECKOUT, RU.ROW from ROOMUSAGE RU inner join CTE on RU.ROWNUM = CTE.ROWNUM + 1 where CTE.CHECKIN = RU.CHECKOUT and CTE.PERSONKEY = RU.PERSONKEY and CTE.ROOMKEY = RU.ROOMKEY) This worked OK for very small datasets (under 100 records) but it's unusable on large datasets. I'm thinking that I should somehow apply the cte recursevely on each PERSONKEY, ROOMKEY grouping on my ROOMUSAGE table but I am not sure how to do that. Any help would be much appreciated, Cheers!

Read the article
How do I pass a lot of parameters to views in Dango?

- by Mark

I'm very new to Django and I'm trying to build an application to present my data in tables and charts. Till now my learning process went very smooth, but now I'm a bit stuck. My pageview retrieves large amounts of data from a database and puts it in the context. The template then generates different html-tables. So far so good. Now I want to add different charts to the template. I manage to do this by defining <img src=".../> tags. The Matplotlib chart is generate in my chartview an returned via: response=HttpResponse(content_type='image/png') canvas.print_png(response) return response Now I have different questions: the data is retrieved twice from the database. Once in the pageview to render the tables, and again in the chartview for making the charts. What is the best way to pass the data, already in the context of the page to the chartview? I need a lot of charts, each with different datasets. I could make a chartview for each chart, but probably there is a better way. How do I pass the different dataset names to the chartview? Some charts have 20 datasets, so I don't think that passing these dataset parameters via the url (like: <imgm src="chart/dataset1/dataset2/.../dataset20/chart.png />) is the right way. Any advice?

Read the article
L10N: Trusted test data for Locale Specific Sorting

- by Chris Betti

I'm working on an internationalized database application that supports multiple locales in a single instance. When international users sort data in the applications built on top of the database, the database theoretically sorts the data using a collation appropriate to the locale associated with the data the user is viewing. I'm trying to find sorted lists of words that meet two criteria: the sorted order follows the collation rules for the locale the words listed will allow me to exercise most / all of the specific collation rules for the locale I'm having trouble finding such trusted test data. Are such sort-testing datasets currently available, and if so, what / where are they? "words.en.txt" is an example text file containing American English text: Andrew Brian Chris Zachary I am planning on loading the list of words into my database in randomized order, and checking to see if sorting the list conforms to the original input. Because I am not fluent in any language other than English, I do not know how to create sample datasets like the following sample one in French (call it "words.fr.txt"): cote côte coté côté The French prefer diacritical marks to be ordered right to left. If you sorted that using code-point order, it likely comes out like this (which is an incorrect collation): cote coté côte côté Thank you for the help, Chris

Read the article
Generic dataset handling library

- by Pep.

Hello, I want to build a generic Perl module for handling and analysing biomedical character separated datasets and which can, most certain, be used on any kind of datasets that contain a mixture of categorical (A,B,C,..) and continuous (1.2,3,881..) and identifier (XXX1,XXX2...). The plan is to have people initialize the module and then use some arguments to point to the data file(s), the place were the analysis reports should be placed and the structure of the data. By structure of data I mean which variable is in which place and its name/type. And this is where I need some enlightenment. I am baffled how to do this in a clean way. Obviously, having people create a simple schema file, be it XML or some other format would be the cleanest but maybe not all people enjoy doing something like this. The solutions I can think of are: Create a configuration file in XML or similar and with a prespecified format. Pass the information during initialization of the module. Use the first row of the data as headers and try to guess types (ouch) Surely there must be a "canonical" way of doing this that is also usable and efficient. Thanks p.

Read the article
How can I build a generic dataset-handling Perl library?

- by Pep.

Hello, I want to build a generic Perl module for handling and analysing biomedical character separated datasets and which can, most certain, be used on any kind of datasets that contain a mixture of categorical (A,B,C,..) and continuous (1.2,3,881..) and identifier (XXX1,XXX2...). The plan is to have people initialize the module and then use some arguments to point to the data file(s), the place were the analysis reports should be placed and the structure of the data. By structure of data I mean which variable is in which place and its name/type. And this is where I need some enlightenment. I am baffled how to do this in a clean way. Obviously, having people create a simple schema file, be it XML or some other format would be the cleanest but maybe not all people enjoy doing something like this. The solutions I can think of are: Create a configuration file in XML or similar and with a prespecified format. Pass the information during initialization of the module. Use the first row of the data as headers and try to guess types (ouch) Surely there must be a "canonical" way of doing this that is also usable and efficient. Thanks p.

Read the article
Inconsistent results in R with RNetCDF - why?

- by sarcozona

I am having trouble extracting data from NetCDF data files using RNetCDF. The data files each have 3 dimensions (longitude, latitude, and a date) and 3 variables (latitude, longitude, and a climate variable). There are four datasets, each with a different climate variable. Here is some of the output from print.nc(p8m.tmax) for clarity. The other datasets are identical except for the specific climate variable. dimensions: month = UNLIMITED ; // (1368 currently) lat = 3105 ; lon = 7025 ; variables: float lat(lat) ; lat:long_name = "latitude" ; lat:standard_name = "latitude" ; lat:units = "degrees_north" ; float lon(lon) ; lon:long_name = "longitude" ; lon:standard_name = "longitude" ; lon:units = "degrees_east" ; short tmax(lon, lat, month) ; tmax:missing_value = -9999 ; tmax:_FillValue = -9999 ; tmax:units = "degree_celsius" ; tmax:scale_factor = 0.01 ; tmax:valid_min = -5000 ; tmax:valid_max = 6000 ; I am getting behavior I don't understand when I use the var.get.nc function from the RNetCDF package. For example, when I attempt to extract 82 values beginning at stval from the maximum temperature data (p8m.tmax <- open.nc(tmaxdataset.nc)) with > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,82)) (where lon_val and lat_val specify the location in the dataset of the coordinates I'm interested in and stval is stval is set to which(time_vec==200201), which in this case equaled 1285.) I get Error: Invalid argument But after successfully extracting 80 and 81 values > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,80)) > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,81)) the command with 82 works: > var.get.nc(p8m.tmax,'tmax', start=c(lon_val, lat_val, stval),count=c(1,1,82)) [1] 444 866 1063 ... [output snipped] The same problem occurs in the identically structured tmin file, but at 36 instead of 82: > var.get.nc(p8m.tmin,'tmin', start=c(lon_val, lat_val, stval),count=c(1,1,36)) produces Error: Invalid argument But after repeating with counts of 30, 31, etc > var.get.nc(p8m.tmin,'tmin', start=c(lon_val, lat_val, stval), count=c(1,1,36)) works. These examples make it seem like the function is failing at the last count, but that actually isn't the case. In the first example, var.get.nc gave Error: Invalid argument after I asked for 84 values. I then narrowed the failure down to the 82nd count by varying the starting point in the dataset and asking for only 1 value at a time. The particular number the problem occurs at also varies. I can close and reopen the dataset and have the problem occur at a different location. In the particular examples above, lon_val and lat_val are 1595 and 1751, respectively, identifying the location in the dataset along the lat and lon dimensions for the latitude and longitude I'm interested in. The 1595th latitude and 1751th longitude are not the problem, however. The problem occurs with all other latitude and longitudes I've tried. Varying the starting location in the dataset along the climate variable dimension (stval) and/or specifying it different (as a number in the command instead of the object stval) also does not fix the problem. This problem doesn't always occur. I can run identical code three times in a row (clearing all objects in between runs) and get a different outcome each time. The first run may choke on the 7th entry I'm trying to get, the second might work fine, and the third run might choke on the 83rd entry. I'm absolutely baffled by such inconsistent behavior. The open.nc function has also started to fail with the same Error: Invalid argument. Like the var.get.nc problems, it also occurs inconsistently. Does anyone know what causes the initial failure to extract the variable? And how I might prevent it? Could have to do with the size of the data files (~60GB each) and/or the fact that I'm accessing them through networked drives? This was also asked here: https://stat.ethz.ch/pipermail/r-help/2011-June/281233.html > sessionInfo() R version 2.13.0 (2011-04-13) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] reshape_0.8.4 plyr_1.5.2 RNetCDF_1.5.2-2 loaded via a namespace (and not attached): [1] tools_2.13.0

Read the article
GDD-BR 2010 [2F] Storage, Bigquery and Prediction APIs

GDD-BR 2010 [2F] Storage, Bigquery and Prediction APIs Speaker: Patrick Chanezon Track: Cloud Computing Time slot: F [15:30 - 16:15] Room: 2 Level: 101 Google is expanding our storage products by introducing Google Storage for Developers. It offers a RESTful API for storing and accessing data at Google. Developers can take advantage of the performance and reliability of Google's storage infrastructure, as well as the advanced security and sharing capabilities. We will demonstrate key functionality of the product as well as customer use cases. Google relies heavily on data analysis and has developed many tools to understand large datasets. Two of these tools are now available on a limited sign-up basis to developers: (1) BigQuery: interactive analysis of very large data sets and (2) Prediction API: make informed predictions from your data. We will demonstrate their use and give instructions on how to get access. From: GoogleDevelopers Views: 1 0 ratings Time: 39:27 More in Science & Technology

Read the article

Search Results

Search found 356 results on 15 pages for 'datasets'.

Page 5/15 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Darren

- by Peter Smith

- by darrenm

- by Darren Gosbell

- by jcea

- by sirlark

- by [email protected]

- by darrenm

- by BobC

- by Lee Englestone

- by Mark

- by julio

- by Evan V.

- by Mark

- by Chris Betti

- by Pep.

- by Pep.

- by sarcozona

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >