Search Results

Search found 356 results on 15 pages for 'datasets'.

Page 15/15 | < Previous Page | 11 12 13 14 15

When someone deletes a shared data source in SSRS

- by Rob Farley

SQL Server Reporting Services plays nicely. You can have things in the catalogue that get shared. You can have Reports that have Links, Datasets that can be used across different reports, and Data Sources that can be used in a variety of ways too. So if you find that someone has deleted a shared data source, you potentially have a bit of a horror story going on. And this works for this month’s T-SQL Tuesday theme, hosted by Nick Haslam, who wants to hear about horror stories. I don’t write about LobsterPot client horror stories, so I’m writing about a situation that a fellow MVP friend asked me about recently instead. The best thing to do is to grab a recent backup of the ReportServer database, restore it somewhere, and figure out what’s changed. But of course, this isn’t always possible. And it’s much nicer to help someone with this kind of thing, rather than to be trying to fix it yourself when you’ve just deleted the wrong data source. Unfortunately, it lets you delete data sources, without trying to scream that the data source is shared across over 400 reports in over 100 folders, as was the case for my friend’s colleague. So, suddenly there’s a big problem – lots of reports are failing, and the time to turn it around is small. You probably know which data source has been deleted, but getting the shared data source back isn’t the hard part (that’s just a connection string really). The nasty bit is all the re-mapping, to get those 400 reports working again. I know from exploring this kind of stuff in the past that the ReportServer database (using its default name) has a table called dbo.Catalog to represent the catalogue, and that Reports are stored here. However, the information about what data sources these deployed reports are configured to use is stored in a different table, dbo.DataSource. You could be forgiven for thinking that shared data sources would live in this table, but they don’t – they’re catalogue items just like the reports. Let’s have a look at the structure of these two tables (although if you’re reading this because you have a disaster, feel free to skim past). Frustratingly, there doesn’t seem to be a Books Online page for this information, sorry about that. I’m also not going to look at all the columns, just ones that I find interesting enough to mention, and that are related to the problem at hand. These fields are consistent all the way through to SQL Server 2012 – there doesn’t seem to have been any changes here for quite a while. dbo.Catalog The Primary Key is ItemID. It’s a uniqueidentifier. I’m not going to comment any more on that. A minor nice point about using GUIDs in unfamiliar databases is that you can more easily figure out what’s what. But foreign keys are for that too… Path, Name and ParentID tell you where in the folder structure the item lives. Path isn’t actually required – you could’ve done recursive queries to get there. But as that would be quite painful, I’m more than happy for the Path column to be there. Path contains the Name as well, incidentally. Type tells you what kind of item it is. Some examples are 1 for a folder and 2 a report. 4 is linked reports, 5 is a data source, 6 is a report model. I forget the others for now (but feel free to put a comment giving the full list if you know it). Content is an image field, remembering that image doesn’t necessarily store images – these days we’d rather use varbinary(max), but even in SQL Server 2012, this field is still image. It stores the actual item definition in binary form, whether it’s actually an image, a report, whatever. LinkSourceID is used for Linked Reports, and has a self-referencing foreign key (allowing NULL, of course) back to ItemID. Parameter is an ntext field containing XML for the parameters of the report. Not sure why this couldn’t be a separate table, but I guess that’s just the way it goes. This field gets changed when the default parameters get changed in Report Manager. There is nothing in dbo.Catalog that describes the actual data sources that the report uses. The default data sources would be part of the Content field, as they are defined in the RDL, but when you deploy reports, you typically choose to NOT replace the data sources. Anyway, they’re not in this table. Maybe it was already considered a bit wide to throw in another ntext field, I’m not sure. They’re in dbo.DataSource instead. dbo.DataSource The Primary key is DSID. Yes it’s a uniqueidentifier... ItemID is a foreign key reference back to dbo.Catalog Fields such as ConnectionString, Prompt, UserName and Password do what they say on the tin, storing information about how to connect to the particular source in question. Link is a uniqueidentifier, which refers back to dbo.Catalog. This is used when a data source within a report refers back to a shared data source, rather than embedding the connection information itself. You’d think this should be enforced by foreign key, but it’s not. It does allow NULLs though. Flags this is an int, and I’ll come back to this. When a Data Source gets deleted out of dbo.Catalog, you might assume that it would be disallowed if there are references to it from dbo.DataSource. Well, you’d be wrong. And not because of the lack of a foreign key either. Deleting anything from the catalogue is done by calling a stored procedure called dbo.DeleteObject. You can look at the definition in there – it feels very much like the kind of Delete stored procedures that many people write, the kind of thing that means they don’t need to worry about allowing cascading deletes with foreign keys – because the stored procedure does the lot. Except that it doesn’t quite do that. If it deleted everything on a cascading delete, we’d’ve lost all the data sources as configured in dbo.DataSource, and that would be bad. This is fine if the ItemID from dbo.DataSource hooks in – if the report is being deleted. But if a shared data source is being deleted, you don’t want to lose the existence of the data source from the report. So it sets it to NULL, and it marks it as invalid. We see this code in that stored procedure. UPDATE [DataSource] SET [Flags] = [Flags] & 0x7FFFFFFD, -- broken link [Link] = NULL FROM [Catalog] AS C INNER JOIN [DataSource] AS DS ON C.[ItemID] = DS.[Link] WHERE (C.Path = @Path OR C.Path LIKE @Prefix ESCAPE '*') Unfortunately there’s no semi-colon on the end (but I’d rather they fix the ntext and image types first), and don’t get me started about using the table name in the UPDATE clause (it should use the alias DS). But there is a nice comment about what’s going on with the Flags field. What I’d LIKE it to do would be to set the connection information to a report-embedded copy of the connection information that’s in the shared data source, the one that’s about to be deleted. I understand that this would cause someone to lose the benefit of having the data sources configured in a central point, but I’d say that’s probably still slightly better than LOSING THE INFORMATION COMPLETELY. Sorry, rant over. I should log a Connect item – I’ll put that on my todo list. So it sets the Link field to NULL, and marks the Flags to tell you they’re broken. So this is your clue to fixing it. A bitwise AND with 0x7FFFFFFD is basically stripping out the ‘2’ bit from a number. So numbers like 2, 3, 6, 7, 10, 11, etc, whose binary representation ends in either 11 or 10 get turned into 0, 1, 4, 5, 8, 9, etc. We can test for it using a WHERE clause that matches the SET clause we’ve just used. I’d also recommend checking for Link being NULL and also having no ConnectionString. And join back to dbo.Catalog to get the path (including the name) of broken reports are – in case you get a surprise from a different data source being broken in the past. SELECT c.Path, ds.Name FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; When I just ran this on my own machine, having deleted a data source to check my code, I noticed a Report Model in the list as well – so if you had thought it was just going to be reports that were broken, you’d be forgetting something. So to fix those reports, get your new data source created in the catalogue, and then find its ItemID by querying Catalog, using Path and Name to find it. And then use this value to fix them up. To fix the Flags field, just add 2. I prefer to use bitwise OR which should do the same. Use the OUTPUT clause to get a copy of the DSIDs of the ones you’re changing, just in case you need to revert something later after testing (doing it all in a transaction won’t help, because you’ll just lock out the table, stopping you from testing anything). UPDATE ds SET [Flags] = [Flags] | 2, [Link] = '3AE31CBA-BDB4-4FD1-94F4-580B7FAB939D' /*Insert your own GUID*/ OUTPUT deleted.Name, deleted.DSID, deleted.ItemID, deleted.Flags FROM dbo.[DataSource] AS ds JOIN dbo.[Catalog] AS c ON c.ItemID = ds.ItemID WHERE ds.[Flags] = ds.[Flags] & 0x7FFFFFFD AND ds.[Link] IS NULL AND ds.[ConnectionString] IS NULL; But please be careful. Your mileage may vary. And there’s no reason why 400-odd broken reports needs to be quite the nightmare that it could be. Really, it should be less than five minutes. @rob_farley

Read the article
Oracle BI Server Modeling, Part 1- Designing a Query Factory

- by bob.ertl(at)oracle.com

Welcome to Oracle BI Development's BI Foundation blog, focused on helping you get the most value from your Oracle Business Intelligence Enterprise Edition (BI EE) platform deployments. In my first series of posts, I plan to show developers the concepts and best practices for modeling in the Common Enterprise Information Model (CEIM), the semantic layer of Oracle BI EE. In this segment, I will lay the groundwork for the modeling concepts. First, I will cover the big picture of how the BI Server fits into the system, and how the CEIM controls the query processing. Oracle BI EE Query Cycle The purpose of the Oracle BI Server is to bridge the gap between the presentation services and the data sources. There are typically a variety of data sources in a variety of technologies: relational, normalized transaction systems; relational star-schema data warehouses and marts; multidimensional analytic cubes and financial applications; flat files, Excel files, XML files, and so on. Business datasets can reside in a single type of source, or, most of the time, are spread across various types of sources. Presentation services users are generally business people who need to be able to query that set of sources without any knowledge of technologies, schemas, or how sources are organized in their company. They think of business analysis in terms of measures with specific calculations, hierarchical dimensions for breaking those measures down, and detailed reports of the business transactions themselves. Most of them create queries without knowing it, by picking a dashboard page and some filters. Others create their own analysis by selecting metrics and dimensional attributes, and possibly creating additional calculations. The BI Server bridges that gap from simple business terms to technical physical queries by exposing just the business focused measures and dimensional attributes that business people can use in their analyses and dashboards. After they make their selections and start the analysis, the BI Server plans the best way to query the data sources, writes the optimized sequence of physical queries to those sources, post-processes the results, and presents them to the client as a single result set suitable for tables, pivots and charts. The CEIM is a model that controls the processing of the BI Server. It provides the subject areas that presentation services exposes for business users to select simplified metrics and dimensional attributes for their analysis. It models the mappings to the physical data access, the calculations and logical transformations, and the data access security rules. The CEIM consists of metadata stored in the repository, authored by developers using the Administration Tool client. Presentation services and other query clients create their queries in BI EE's SQL-92 language, called Logical SQL or LSQL. The API simply uses ODBC or JDBC to pass the query to the BI Server. Presentation services writes the LSQL query in terms of the simplified objects presented to the users. The BI Server creates a query plan, and rewrites the LSQL into fully-detailed SQL or other languages suitable for querying the physical sources. For example, the LSQL on the left below was rewritten into the physical SQL for an Oracle 11g database on the right. Logical SQL Physical SQL SELECT "D0 Time"."T02 Per Name Month" saw_0, "D4 Product"."P01 Product" saw_1, "F2 Units"."2-01 Billed Qty (Sum All)" saw_2 FROM "Sample Sales" ORDER BY saw_0, saw_1 WITH SAWITH0 AS ( select T986.Per_Name_Month as c1, T879.Prod_Dsc as c2, sum(T835.Units) as c3, T879.Prod_Key as c4 from Product T879 /* A05 Product */ , Time_Mth T986 /* A08 Time Mth */ , FactsRev T835 /* A11 Revenue (Billed Time Join) */ where ( T835.Prod_Key = T879.Prod_Key and T835.Bill_Mth = T986.Row_Wid) group by T879.Prod_Dsc, T879.Prod_Key, T986.Per_Name_Month ) select SAWITH0.c1 as c1, SAWITH0.c2 as c2, SAWITH0.c3 as c3 from SAWITH0 order by c1, c2 Probably everybody reading this blog can write SQL or MDX. However, the trick in designing the CEIM is that you are modeling a query-generation factory. Rather than hand-crafting individual queries, you model behavior and relationships, thus configuring the BI Server machinery to manufacture millions of different queries in response to random user requests. This mass production requires a different mindset and approach than when you are designing individual SQL statements in tools such as Oracle SQL Developer, Oracle Hyperion Interactive Reporting (formerly Brio), or Oracle BI Publisher. The Structure of the Common Enterprise Information Model (CEIM) The CEIM has a unique structure specifically for modeling the relationships and behaviors that fill the gap from logical user requests to physical data source queries and back to the result. The model divides the functionality into three specialized layers, called Presentation, Business Model and Mapping, and Physical, as shown below. Presentation services clients can generally only see the presentation layer, and the objects in the presentation layer are normally the only ones used in the LSQL request. When a request comes into the BI Server from presentation services or another client, the relationships and objects in the model allow the BI Server to select the appropriate data sources, create a query plan, and generate the physical queries. That's the left to right flow in the diagram below. When the results come back from the data source queries, the right to left relationships in the model show how to transform the results and perform any final calculations and functions that could not be pushed down to the databases. Business Model Think of the business model as the heart of the CEIM you are designing. This is where you define the analytic behavior seen by the users, and the superset library of metric and dimension objects available to the user community as a whole. It also provides the baseline business-friendly names and user-readable dictionary. For these reasons, it is often called the "logical" model--it is a virtual database schema that persists no data, but can be queried as if it is a database. The business model always has a dimensional shape (more on this in future posts), and its simple shape and terminology hides the complexity of the source data models. Besides hiding complexity and normalizing terminology, this layer adds most of the analytic value, as well. This is where you define the rich, dimensional behavior of the metrics and complex business calculations, as well as the conformed dimensions and hierarchies. It contributes to the ease of use for business users, since the dimensional metric definitions apply in any context of filters and drill-downs, and the conformed dimensions enable dashboard-wide filters and guided analysis links that bring context along from one page to the next. The conformed dimensions also provide a key to hiding the complexity of many sources, including federation of different databases, behind the simple business model. Note that the expression language in this layer is LSQL, so that any expression can be rewritten into any data source's query language at run time. This is important for federation, where a given logical object can map to several different physical objects in different databases. It is also important to portability of the CEIM to different database brands, which is a key requirement for Oracle's BI Applications products. Your requirements process with your user community will mostly affect the business model. This is where you will define most of the things they specifically ask for, such as metric definitions. For this reason, many of the best-practice methodologies of our consulting partners start with the high-level definition of this layer. Physical Model The physical model connects the business model that meets your users' requirements to the reality of the data sources you have available. In the query factory analogy, think of the physical layer as the bill of materials for generating physical queries. Every schema, table, column, join, cube, hierarchy, etc., that will appear in any physical query manufactured at run time must be modeled here at design time. Each physical data source will have its own physical model, or "database" object in the CEIM. The shape of each physical model matches the shape of its physical source. In other words, if the source is normalized relational, the physical model will mimic that normalized shape. If it is a hypercube, the physical model will have a hypercube shape. If it is a flat file, it will have a denormalized tabular shape. To aid in query optimization, the physical layer also tracks the specifics of the database brand and release. This allows the BI Server to make the most of each physical source's distinct capabilities, writing queries in its syntax, and using its specific functions. This allows the BI Server to push processing work as deep as possible into the physical source, which minimizes data movement and takes full advantage of the database's own optimizer. For most data sources, native APIs are used to further optimize performance and functionality. The value of having a distinct separation between the logical (business) and physical models is encapsulation of the physical characteristics. This encapsulation is another enabler of packaged BI applications and federation. It is also key to hiding the complex shapes and relationships in the physical sources from the end users. Consider a routine drill-down in the business model: physically, it can require a drill-through where the first query is MDX to a multidimensional cube, followed by the drill-down query in SQL to a normalized relational database. The only difference from the user's point of view is that the 2nd query added a more detailed dimension level column - everything else was the same. Mappings Within the Business Model and Mapping Layer, the mappings provide the binding from each logical column and join in the dimensional business model, to each of the objects that can provide its data in the physical layer. When there is more than one option for a physical source, rules in the mappings are applied to the query context to determine which of the data sources should be hit, and how to combine their results if more than one is used. These rules specify aggregate navigation, vertical partitioning (fragmentation), and horizontal partitioning, any of which can be federated across multiple, heterogeneous sources. These mappings are usually the most sophisticated part of the CEIM. Presentation You might think of the presentation layer as a set of very simple relational-like views into the business model. Over ODBC/JDBC, they present a relational catalog consisting of databases, tables and columns. For business users, presentation services interprets these as subject areas, folders and columns, respectively. (Note that in 10g, subject areas were called presentation catalogs in the CEIM. In this blog, I will stick to 11g terminology.) Generally speaking, presentation services and other clients can query only these objects (there are exceptions for certain clients such as BI Publisher and Essbase Studio). The purpose of the presentation layer is to specialize the business model for different categories of users. Based on a user's role, they will be restricted to specific subject areas, tables and columns for security. The breakdown of the model into multiple subject areas organizes the content for users, and subjects superfluous to a particular business role can be hidden from that set of users. Customized names and descriptions can be used to override the business model names for a specific audience. Variables in the object names can be used for localization. For these reasons, you are better off thinking of the tables in the presentation layer as folders than as strict relational tables. The real semantics of tables and how they function is in the business model, and any grouping of columns can be included in any table in the presentation layer. In 11g, an LSQL query can also span multiple presentation subject areas, as long as they map to the same business model. Other Model Objects There are some objects that apply to multiple layers. These include security-related objects, such as application roles, users, data filters, and query limits (governors). There are also variables you can use in parameters and expressions, and initialization blocks for loading their initial values on a static or user session basis. Finally, there are Multi-User Development (MUD) projects for developers to check out units of work, and objects for the marketing feature used by our packaged customer relationship management (CRM) software. The Query Factory At this point, you should have a grasp on the query factory concept. When developing the CEIM model, you are configuring the BI Server to automatically manufacture millions of queries in response to random user requests. You do this by defining the analytic behavior in the business model, mapping that to the physical data sources, and exposing it through the presentation layer's role-based subject areas. While configuring mass production requires a different mindset than when you hand-craft individual SQL or MDX statements, it builds on the modeling and query concepts you already understand. The following posts in this series will walk through the CEIM modeling concepts and best practices in detail. We will initially review dimensional concepts so you can understand the business model, and then present a pattern-based approach to learning the mappings from a variety of physical schema shapes and deployments to the dimensional model. Along the way, we will also present the dimensional calculation template, and learn how to configure the many additivity patterns.

Read the article
CodePlex Daily Summary for Wednesday, November 21, 2012

CodePlex Daily Summary for Wednesday, November 21, 2012Popular ReleasesImapX 2: ImapX 2.0.0.6: An updated release of the ImapX 2 library, containing many bugfixes for both, the library and the sample application.Metodología General Ajustada - MGA: 03.05.05: Cambios John: Se modificó el Procedimiento Alamacenado PROF03ObjetivoProductoConsultarIdF03 que no incluía los campos de IdUnidadMedida y UnidadMedida, lo que generaba error en la capa de datos al leer estos campos (PasarDataSetAPROF03ObjetivoProductoInfo) y terminaba devolviendo NULL en los registros, esto no dejaba la información en la Exportación y por ende en la Importación no subían los Productos. Generación de instaladores. Soporte técnico por correo electrónico, telefónico y en sitio.WiX Toolset: WiX v3.7 RC: WiX v3.7 RC (3.7.1119.0) provides feature complete Bundle update and reference tracking plus several bug fixes. For more information see Rob's blog post about the release: http://robmensching.com/blog/posts/2012/11/20/WiX-v3.7-Release-Candidate-availablePicturethrill: Version 2.11.20.0: Fixed up Bing image provider on Windows 8Excel AddIn to reset the last worksheet cell: XSFormatCleaner.xla: Modified the commandbar code to use CommandBar IDs instead of English names.Json.NET: Json.NET 4.5 Release 11: New feature - Added ITraceWriter, MemoryTraceWriter, DiagnosticsTraceWriter New feature - Added StringEscapeHandling with options to escape HTML and non-ASCII characters New feature - Added non-generic JToken.ToObject methods New feature - Deserialize ISet<T> properties as HashSet<T> New feature - Added implicit conversions for Uri, TimeSpan, Guid New feature - Missing byte, char, Guid, TimeSpan and Uri explicit conversion operators added to JToken New feature - Special case...EntitiesToDTOs - Entity Framework DTO Generator: EntitiesToDTOs.v3.0: DTOs and Assemblers can be generated inside project folders! Choose the types you want to generate! Support for Visual Studio 2012 !!! Support for new Entity Framework EDMX (format used by VS2012) ! Support for Enum Types! Optional automatic check for updates! Added the following methods to Assemblers! IEnumerable<DTO>.ToEntities() : ICollection<Entity> IEnumerable<Entity>.ToDTOs() : ICollection<DTO> Indicate class identifier for DTOs and Assemblers! Cleaner Assemblers code....mojoPortal: 2.3.9.4: see release notes on mojoportal.com http://www.mojoportal.com/mojoportal-2394-released Note that we have separate deployment packages for .NET 3.5 and .NET 4.0, but we recommend you to use .NET 4, we will probably drop support for .NET 3.5 once .NET 4.5 is available The deployment package downloads on this page are pre-compiled and ready for production deployment, they contain no C# source code and are not intended for use in Visual Studio. To download the source code see getting the lates...VidCoder: 1.4.6 Beta: Brought back the x264 advanced options panel due to popular demand. Thank you for all the feedback. x264 Preset/Profile/Tune/Level has been moved back to the Video tab, along with a copy of the "extra options" string. Added Fast Decode and Zero Latency checkboxes to support multiple Tunes. Added cropping option "None". Audio bitrates that are incompatible with the encoder (such as MP3 > 320 kbps) are no longer preset on the list. Fixed crash on opening VidCoder after de-selecting "re...DotNetNuke® Store: 03.01.07: What's New in this release? IMPORTANT: this version requires DotNetNuke 04.06.02 or higher! DO NOT REPORT BUGS HERE IN THE ISSUE TRACKER, INSTEAD USE THE DotNetNuke Store Forum! Bugs corrected: - Replaced some hard coded references to the default address provider classes by the corresponding interfaces to allow the creation of another address provider with a different name. New Features: - Added the 'pickup' delivery option at checkout. - Added the 'no delivery' option in the Store Admin ...Bundle Transformer - a modular extension for ASP.NET Web Optimization Framework: Bundle Transformer 1.6.10: Version: 1.6.10 Published: 11/18/2012 Now almost all of the Bundle Transformer's assemblies is signed (except BundleTransformer.Yui.dll); In BundleTransformer.SassAndScss the SassAndCoffee.Ruby library was replaced by my own implementation of the Sass- and SCSS-compiler (based on code of the SassAndCoffee.Ruby library version 2.0.2.0); In BundleTransformer.CoffeeScript added support of CoffeeScript version 1.4.0-3; In BundleTransformer.TypeScript added support of TypeScript version 0....ExtJS based ASP.NET 2.0 Controls: FineUI v3.2.0: +2012-11-18 v3.2.0 -?????????????????SelectedValueArray????????（◇?◆:）。 -???????????????????RecoverPropertiesFromJObject????（〓?〓、????、??、Vian_Pan）。 -????????????，?????????????，???SelectedValueArray???????（sam.chang）。 -??Alert.Show???????????（swtseaman）。 -???????????????，??Icon??IconUrl????（swtseaman）。 -?????????TimePicker（??）。 -?????????，??/res.axd?css=blue.css&v=1。 -????????，?????????????，???????。 -????MenuCheckBox（???????）。 -?RadioButton??AutoPostBack??。 -???????FCKEditor?????????...BugNET Issue Tracker: BugNET 1.2: Please read our release notes for BugNET 1.2: http://blog.bugnetproject.com/bugnet-1-2-has-been-released Please do not post questions as reviews. Questions should be posted in the Discussions tab, where they will usually get promptly responded to. If you post a question as a review, you will pollute the rating, and you won't get an answer.Paint.NET PSD Plugin: 2.2.0: Changes: Layer group visibility is now applied to all layers within the group. This greatly improves the visual fidelity of complex PSD files that have hidden layer groups. Layer group names are prefixed so that users can get an indication of the layer group hierarchy. (Paint.NET has a flat list of layers, so the hierarchy is flattened out on load.) The progress bar now reports status when saving PSD files, instead of showing an indeterminate rolling bar. Performance improvement of 1...CRM 2011 Visual Ribbon Editor: Visual Ribbon Editor (1.3.1116.7): [IMPROVED] Detailed error message descriptions for FaultException [FIX] Fixed bug in rule CrmOfflineAccessStateRule which had incorrect State attribute name [FIX] Fixed bug in rule EntityPropertyRule which was missing PropertyValue attribute [FIX] Current connection information was not displayed in status bar while refreshing list of entitiesSuper Metroid Randomizer: Super Metroid Randomizer v5: v5 -Added command line functionality for automation purposes. -Implented Krankdud's change to randomize the Etecoon's item. NOTE: this version will not accept seeds from a previous version. The seed format has changed by necessity. v4 -Started putting version numbers at the top of the form. -Added a warning when suitless Maridia is required in a parsed seed. v3 -Changed seed to only generate filename-legal characters. Using old seeds will still work exactly the same. -Files can now be saved...Caliburn Micro: WPF, Silverlight, WP7 and WinRT/Metro made easy.: Caliburn.Micro v1.4: Changes This version includes many bug fixes across all platforms, improvements to nuget support and...the biggest news of all...full support for both WinRT and WP8. Download Contents Debug and Release Assemblies Samples Readme.txt License.txt Packages Available on Nuget Caliburn.Micro – The full framework compiled into an assembly. Caliburn.Micro.Start - Includes Caliburn.Micro plus a starting bootstrapper, view model and view. Caliburn.Micro.Container – The Caliburn.Micro invers...DirectX Tool Kit: November 15, 2012: November 15, 2012 Added support for WIC2 when available on Windows 8 and Windows 7 with KB 2670838 Cleaned up warning level 4 warningsDotNetNuke® Community Edition CMS: 06.02.05: Major Highlights Updated the system so that it supports nested folders in the App_Code folder Updated the Global Error Handling so that when errors within the global.asax handler happen, they are caught and shown in a page displaying the original HTTP error code Fixed issue that stopped users from specifying Link URLs that open on a new window Security FixesFixed issue in the Member Directory module that could show members to non authenticated users Fixed issue in the Lists modul...fastJSON: v2.0.10: - added MonoDroid projectNew Projects1121codeplex01: Today's task is to test portal on JapaneseAgileToDo: a to do list use wpf ef sqlce!Applay: Applay is a library that allows you to wrap authorization and validation around the services of your application layer by using a dynamic proxy.ArunimaErp: Enterprise Resource Planning Software for Arunima GroupBootCMS: BootCMS makes webdevelopment easy.codeplex01: I need go out for a whileCoding4Fun's Maelstrom: Introduced at //build/ 2012, Maelstrom is Coding4Fun's latest creation. Step up to the podium and battle against your opponent in full-on stereoscopic 3D!CTCS Project 2012: 11/21/2012 @ svn repository ctodo: TODO List Management Librarydeploy-with-ease: One-click deployment tool based on DrobBox files hostingDesign Resources .NET: D-R.NET is a set of pre-built implementations of oft-recurring application designs. D-R.NET saves considerable time and money in building user-focused applications: from basic to complex. DnfWeb: dnf??Dr.Peng: dr.peng ????????。DriveKeepAlive: Managed .NET service intended to keep external hard drives "awake" for immediate access. Developed in C# with Visual Studio 2008Ecommerce Platform: Ecommerce PlatformEventManagerReset: Project created for Reset meetings.FaceComparerDistributed: project of face compare distributed versionFileSystemExplorerExample: WPF MVVM Sample applicationFinlogiK ReSharper Contrib: FinlogiK ReSharper Contrib is a plugin for ReSharper 5.1 which adds code cleanup and inspection options for static qualifiers.Gestione Lampade Votive: Gestione dei canoni annuali dei loculi cimiteriali, con stampa di comunicazioni ai contribuenti e dei bollettini di conto corrente postale (a due o tre cedole).GI_PII: HABA BABA?GIV_P2: second projectHex o'clock: Projekt kolorowego zegara.IISProcessScheduler: Schedule processes from within IIS.Image Tagger & Resizer: Resize, and text in the lower right of picture with i.e. copyright information.IT Kohvik: ITK cafe school project.jean1121codeplex01: goodKooboo3 Helper: It's a developer tutorial code for kooboo cms v3. http://kooboo.codeplex.comMAVI: mobile application for the visually impaired: bill recognition & tag and recognize objects based on a specific stickerMecanismos de Segurança Interoperáveis para Serviços Web: Esse projeto pretende desenvolver um framework que forneça requisitos de segurança de forma interoperável através de Serviços Web. Metro UI For Windows Forms: Provides a set of controls and form templates for designing user interfaces based on a similar minimalist metro style. For those who love Windows Forms.NHSmartBootstrapper: In a "fast-changing" world, your LoB application needs to be ready to change as well. The usage of NHibernate Listeners together with smart application bootstrapping, even in a complex scenario, can lead to extensible and new-feature-ready applications. Office Add-In Monitor: Office Add-in Monitor protects add-ins from being disabled.Orchard Responsive Theme Machine: A responsive version of the Orchard CMS "Theme Machine" which is commonly used as a starting point for building custom themes. Supports many resolutions.Orchard Simple Contact Form: An Orchard CMS module that provides a simple contact form that sends an email. It can be used as either a content part or widget.Peon War: Peon war is a game where peons are fighting.Project Files Linker (VS Add-In): PFL project is used to generate multiple projects with links to the same files to achieve projects for different .NET FW versions.Quibbler - Universal News Reader: Quibbler is a product designed and developed by Indigo Architects. Quibbler is a desktop application which runs on user's machine and provides a intuitive user interface for reading news in offline mode. Quibbler is developed in WPF (.Net 3.5).Samcrypt: .SenchaTest: SenchaTestShared Genomics Project - Workbench Codebase: The Shared Genomics workbench enables a diverse user group of researchers to explore the associations between genetic and other factors in their datasets. It provides a graphical user interface to the analysis functions published in a sister Codeplex project i.e. MPI Codebase.SharePoint 2013 FBA Pack: This is the home of the SharePoint 2013 FBA Pack. The FBA Pack for SharePoint 2013 is currently in development and is coming soon.SharePoint Term Store PowerShell Backup & Restore Scripts: This project is focused on development of PowerShell script tools for backup and restore of SharePoint Managed Metadata service application Term Store taxonomy.SharpPlanets: A simple game completely designed and written in C#, inspired by JPlanets.SpaceShooter: A small hobbyist game. It is similar to the 2D Arcade shooter games.Stretched Background Image jQuery plugin: jQuery plugin for adding a stretched background image for any element in a web page. Uses an absolutely positioned image at z-index -1.Stsadm Templates for Visual Studio: The Stsadm Templates for Visual Studio 2005 and 2008 support you in making command extensions for SharePoint's commnand line tool stsadm.exe.SwissPost EasyTrack API: The SwissPost EasyTrack API allows you to track your parcels or letters everytime and in every application.System.Threading.Joins: The Joins project provides asynchronous concurrency semantics based on join calculus and modeled after the Microsoft Research C? (C Omega) project.T nagu Tetris: Meie versioon tuntud mängust tetris.testdd11202012tfs01: juktestddhg11202012hg01: stesttom11202012git02: fdsfdstesttom11202012hg01: gfdTetrissimus: Tetrissimus is an open source "Tetris" alike game totally written in DHTML (JavaScript, CSS and HTML) that uses keyboard. This cross-platform and cross-browser game was tested under BeOS, Linux, NetBSD, OpenBSD, FreeBSD, Windows and others.Thrift Client .NET for WinRT (Windows Store Apps): thrift .net client for WinRT applicationTwitter Bootstrap for SharePoint: A Masterpage for SharePoint 2010 including the twitter bootstrap front-end frameworkTX Spell .NET ActiveX Package: TX Spell .NET ActiveX Package enables you to add high-performance spell checking capabilities to your VB6 applications.USB ACCELEROMETER: This project is a test demo for usb accelerometer. Application plays music (mp3 file) while usb acc gives high values from its coordinate between interval.VfaAccoutApps: Cash Payment Application of Vf AsiaVisualPoint Use PowerPoint inside Visual Studio: VisualPoint lets you show PowerPoint presentations from inside Visual Studio. Future release will automate walkthroughs and presentations.VS2010 Rc1 Fix: Illustrates a fix for working with the ASAP.NET Wizard control with VS2010 RC1WebSite.Request: WebSite.Request launch web request (via XMLHTTP) on website. Use, for example, to make initial request to sharepoint URL and escape "slow first request" problem.WPF Checked ListBox: This is simple implementation of WPF Checked ListBoxWPortal: doing nothing. that's it. i just want to use the subversion management. XNA Capture the Flag for the Microsoft Zune: Capture the Flag is a 2d Capture the flag game made for the Zune platform using XNA 3.0 CTP. Players choose to join or start a network session in the main menu. When in game, the player uses left or right on the DPad to choose the team on which to play with. Once sides have been chosen the party leader presses the center button on the Dpad to start the game. Teams switch between offense and defense for a total of 4 rounds in each game. When the game is over the party leader simply presses th...XPS Indexer: Xps file indexing for Google Desktop

Read the article
CodePlex Daily Summary for Friday, March 18, 2011

CodePlex Daily Summary for Friday, March 18, 2011Popular ReleasesCraig's Utility Library: Craig's Utility Library 2.1: This update contains the following functionality additions: Added Min and Max functions to MathHelper Added code for handling comments to BlogML code. Added WMI based file search ability (at present only searches based on file extension). Added CellularTexture class Added FaultFormation class Added PerlinNoise class Added Akismet helper classes Added Gravatar image link generator Added PipeDelimited class Added FaultFormation class Added FluvialErosion functions to Image c...DirectQ: Release 1.8.7 (RC3): Release Candidate 3 of 1.8.7 fixing many bugs and adding some new functionality.Catel - WPF, Silverlight and WP7 MVVM library: 1.3: Catel history ============= (+) Added (*) Changed (-) Removed (x) Error / bug (fix) For more information about issues or new feature requests, please visit: http://catel.codeplex.com =========== Version 1.3 =========== Release date: ============= 2011/03/18 Added/fixed: ============ (+) Added BindingHelper class to evaluate binding values manually (+) UserControl<TViewModel> now supports master-detail views where the detail view would have a nested UserControl<TViewModel> and no parent ...CBM-Command: Version 2.0 - 2011-03-17 - Final Release: This is the final release of CBM-Command Version 2.0. This version is intended to replace all prior versions you may have downloaded. Please see release notes for prior versions to get comprehensive list of changes. New features since RC1: - (C64) Added double-buffering when displaying the directory in a panel. This eliminates the flickering that users were experiencing when scrolling through long directories. Changes since RC1: - (All Machines) Changed the algorithm for displaying the d...Phalanger - The PHP Language Compiler for the .NET Framework: 2.1 (March 2011) for .NET 4.0: Introducing release of Phalanger 2.1 for .NET 4.0. This release brings big performance boost about 20% in most of the operations. This improvement can be expected also in an overall performance in many PHP applications, e.g. Wordpress. It is the first release that targets .NET Framework 4.0 which allows developers to move the project forward. To migrate your old Phalanger applications from Phalanger 2.0 to 2.1 please follow Migration to 2.1. Installation package also includes basic version o...NodeXL: Network Overview, Discovery and Exploration for Excel: NodeXL Excel Template, version 1.0.1.164: The NodeXL Excel template displays a network graph using edge and vertex lists stored in an Excel 2007 or Excel 2010 workbook. What's NewThis release adds new layout options for groups, makes some minor feature improvements, and fixes a few bugs. See the Complete NodeXL Release History for details. Installation StepsFollow these steps to install and use the template: Download the Zip file. Unzip it into any folder. Use WinZip or a similar program, or just right-click the Zip file in Wi...Leage of Legends Masteries Tool: LoLMasterSave_v1.6.1.274: -Addresses resent LoL update that interfered with the way MasterSave sets / reads masteries - Removed Shift windows since some people experiencing issues If your interested in this function i can provide it as small separate tool.ASP.NET Comet Ajax Library (Reverse Ajax - Server Push): ASP.NET Server Push Samples: This package contains 14 sample projects.LogExpert: 1.4 build 4092: TabControl: Tooltip on dropdown list shows full path names now New menu item "Lock instance" in Options menu. Only available when "Allow only one instance" is disabled in the settings. "Lock instance" will temporary enable the single instance mode. The locked instance will receive all new launched files Some NullPtrExceptions fixed (e.g. in the settings dialog) Note: The debug build is identical to the release build. But the debug version writes a log file. It also contains line numbers ...Facebook C# SDK: 5.0.6 (BETA): This is seventh BETA release of the version 5 branch of the Facebook C# SDK. Remember this is a BETA build. Some things may change or not work exactly as planned. We are absolutely looking for feedback on this release to help us improve the final 5.X.X release. New in this release: Version 5.0.6 is almost completely backward compatible with 4.2.1 and 5.0.3 (BETA) Bug fixes and helpers to simplify many common scenarios For more information about this release see the following blog posts: F...DotNetNuke® Community Edition: 06.00.00 CTP: CTP 1 (Build 155) is firmly focused around our conversion to C#. As many people have noted, this is a significant change to the platform and affects all areas of the product. This is one of the driving factors in why we felt it was important to get this release into your hands as soon as possible. We have already done quite a bit of testing on this feature internally and have fixed a number of issues in this area. We also recognize that there are probably still some more bugs to be found ...Kooboo CMS: Kooboo 3.0 RC: Bug fixes Inline editing toolbar positioning for websites with complicate CSS. Inline editing is turned on by default now for the samplesite template. MongoDB version content query for multiple filters. . Add a new 404 page to guide users to login and create first website. Naming validation for page name and datarule name. Files in this download kooboo_CMS.zip: The Kooboo application files Content_DBProvider.zip: Additional content database implementation of MSSQL,SQLCE, RavenDB ...SQL Monitor - tracking sql server activities: SQL Monitor 3.2: 1. introduce sql color syntax highlighting with http://www.codeproject.com/KB/edit/FastColoredTextBox_.aspxUmbraco CMS: Umbraco 4.7.0: Service release fixing 50+ issues! Getting Started A great place to start is with our Getting Started Guide: Getting Started Guide: http://umbraco.codeplex.com/Project/Download/FileDownload.aspx?DownloadId=197051 Make sure to check the free foundation videos on how to get started building Umbraco sites. They're available from: Introduction for webmasters: http://umbraco.tv/help-and-support/video-tutorials/getting-started Understand the Umbraco concepts: http://umbraco.tv/help-and-support...ProDinner - ASP.NET MVC EF4 Code First DDD jQuery Sample App: first release: ProDinner is an ASP.NET MVC sample application, it uses DDD, EF4 Code First for Data Access, jQuery and MvcProjectAwesome for Web UI, it has Multi-language User Interface Features: CRUD and search operations for entities Multi-Language User Interface upload and crop Images (make thumbnail) for meals pagination using "more results" button very rich and responsive UI (using Mvc Project Awesome) Multiple UI themes (using jQuery UI themes)BEPUphysics: BEPUphysics v0.15.1: Latest binary release. Version HistoryLiveChat Starter Kit: LCSK v1.1: This release contains couple of new features and bug fixes including: Features: Send chat transcript via email Operator can now invite visitor to chat (pro-active chat request) Bug Fixes: Operator management (Save and Delete) bug fixes Operator Console chat small fixesIronRuby: 1.1.3: IronRuby 1.1.3 is a servicing release that keeps on improving compatibility with Ruby 1.9.2 and includes IronRuby integration to Visual Studio 2010. We decided to drop 1.8.6 compatibility mode in all post-1.0 releases. We recommend using IronRuby 1.0 if you need 1.8.6 compatibility. The main purpose of this release is to sync with IronPython 2.7 release, i.e. to keep the Dynamic Language Runtime that both these languages build on top shareable. This release also fixes a few bugs: 5763 Use...SQL Server PowerShell Extensions: 2.3.2.1 Production: Release 2.3.2.1 implements SQLPSX as PowersShell version 2.0 modules. SQLPSX consists of 13 modules with 163 advanced functions, 2 cmdlets and 7 scripts for working with ADO.NET, SMO, Agent, RMO, SSIS, SQL script files, PBM, Performance Counters, SQLProfiler, Oracle and MySQL and using Powershell ISE as a SQL and Oracle query tool. In addition optional backend databases and SQL Server Reporting Services 2008 reports are provided with SQLServer and PBM modules. See readme file for details.IronPython: 2.7: On behalf of the IronPython team, I'm very pleased to announce the release of IronPython 2.7. This release contains all of the language features of Python 2.7, as well as several previously missing modules and numerous bug fixes. IronPython 2.7 also includes built-in Visual Studio support through IronPython Tools for Visual Studio. IronPython 2.7 requires .NET 4.0 or Silverlight 4. To download IronPython 2.7, visit http://ironpython.codeplex.com/releases/view/54498. Any bugs should be report...New Projects4711 Widget: Time Widget For 4711Baidu map api: Baidu map apiBookmooch.NET: A .NET wrapper for the Bookmooch.com API.Crash Cite - Electronic Crash Reporting Software: While we were in the electronic citation business - we wrote Indiana's eCWS - we decided to write our own crash reporting software. We exited from that biz because it's too politically charged. Here for you to use is an almost complete electronic crash reporting solution. Enjoy.Curso apb: Proyecto para gestionar un curso academico basado en apb. Ejemplo para aprendizaje personalEle: simple database changes log web applicationEmptor by Intrigue Deviation: An idea for collaboration and relations between customers and businesses.FixEd: FixEd is a text editor for working with text files that require enforcement of column widths and line lengths. Fixed-Width files...Typical from mainframe datasets or csv files. It provides a plugin parser system, so that files may be exported to and imported from any format.Free inventory: InventoryFurcadia Heimdall Tester: An application that helps Furcadia technicians test the integrity of the game server. It checks for availability of each heimdall, its connectivity to the rest of the system (horton/tribble) and how often it receives a user compared to the rest of them.GestOre: Software per gestire le ore di lavoro. Creazione di report mensili.Grauers Google Chart WebPart: SharePoint Foundation Google WebPart Chart. Create a custom list and connect Grauers Google Chart with the list. The Chart is interactive. Created by Ola GrauershOcr2Pdf.NET: hOcr2Pdf.NET is program/library to convert .hOcr files into a searchable pdf. It is currently being written and tested again the .hocr files products by the Tesseract ocr engine.Hyper-V Guest Console: This application provides a console for manage VMs use within a Hyper-V guest. With HVGC an Hyper-V guest can: - Start VMs - Shutdow VMs - Stop VMs - Monitoring VMs HVGC is develop in Visual Studio 2008 using VB.NET.JUpload.Net: JUpload.Net is an Asp Net control which encapsulates the popular Jupload java applet (http://jupload.sourceforge.net). myMvcBlog: my mvc blogNGI.Framework: NGI.FrameworkOnlineLicense: OnlineLicense is a PHP, mySQL and C# project. PHP and mySQL runing at Web Server side to store the online user information, and return the proper license for desktop applications which write by C#. SharePoint 2010 Managed Metadata Import Tool: Command-line SharePoint 2010 Managed Metadata import tool. Uses (almost) same CSV format as the Microsoft importer and supports static term GUIDs. Unlike the MS tool, this importer does not cause data corruption issues when taxonomy terms are used with Content Organizer rules.Sphere Layout Library: ItemsControl library targetting WPF, Silverlight for desktop and Silverlight for WP7, allowing to layout things on a 3d Sphere.TestProject for C#: Test project to test a Team Foundation Server.TfsUtils: Simple Project to store some TFS API utility code.This Is My New Project: FasdfTidy UI: Tidy UI is a "write less - do more" framework for .NET focusing on both client components and code for easy data access. The main goal of the framework is to enable ASP.NET developers to focus more on functionality and less on the technical aspects of the development process.TönnenKlapps: XNA game where you try to smash a 3D spinning barrel using the correct coloured buttons and the right timing.yihuaisha_Window: Glest research

Read the article
How to Load Oracle Tables From Hadoop Tutorial (Part 5 - Leveraging Parallelism in OSCH)

- by Bob Hanckel

Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4 Using OSCH: Beyond Hello World In the previous post we discussed a “Hello World” example for OSCH focusing on the mechanics of getting a toy end-to-end example working. In this post we are going to talk about how to make it work for big data loads. We will explain how to optimize an OSCH external table for load, paying particular attention to Oracle’s DOP (degree of parallelism), the number of external table location files we use, and the number of HDFS files that make up the payload. We will provide some rules that serve as best practices when using OSCH. The assumption is that you have read the previous post and have some end to end OSCH external tables working and now you want to ramp up the size of the loads. Using OSCH External Tables for Access and Loading OSCH external tables are no different from any other Oracle external tables. They can be used to access HDFS content using Oracle SQL: SELECT * FROM my_hdfs_external_table; or use the same SQL access to load a table in Oracle. INSERT INTO my_oracle_table SELECT * FROM my_hdfs_external_table; To speed up the load time, you will want to control the degree of parallelism (i.e. DOP) and add two SQL hints. ALTER SESSION FORCE PARALLEL DML PARALLEL 8; ALTER SESSION FORCE PARALLEL QUERY PARALLEL 8; INSERT /*+ append pq_distribute(my_oracle_table, none) */ INTO my_oracle_table SELECT * FROM my_hdfs_external_table; There are various ways of either hinting at what level of DOP you want to use. The ALTER SESSION statements above force the issue assuming you (the user of the session) are allowed to assert the DOP (more on that in the next section). Alternatively you could embed additional parallel hints directly into the INSERT and SELECT clause respectively. /*+ parallel(my_oracle_table,8) *//*+ parallel(my_hdfs_external_table,8) */ Note that the "append" hint lets you load a target table by reserving space above a given "high watermark" in storage and uses Direct Path load. In other doesn't try to fill blocks that are already allocated and partially filled. It uses unallocated blocks. It is an optimized way of loading a table without incurring the typical resource overhead associated with run-of-the-mill inserts. The "pq_distribute" hint in this context unifies the INSERT and SELECT operators to make data flow during a load more efficient. Finally your target Oracle table should be defined with "NOLOGGING" and "PARALLEL" attributes. The combination of the "NOLOGGING" and use of the "append" hint disables REDO logging, and its overhead. The "PARALLEL" clause tells Oracle to try to use parallel execution when operating on the target table. Determine Your DOP It might feel natural to build your datasets in Hadoop, then afterwards figure out how to tune the OSCH external table definition, but you should start backwards. You should focus on Oracle database, specifically the DOP you want to use when loading (or accessing) HDFS content using external tables. The DOP in Oracle controls how many PQ slaves are launched in parallel when executing an external table. Typically the DOP is something you want to Oracle to control transparently, but for loading content from Hadoop with OSCH, it's something that you will want to control. Oracle computes the maximum DOP that can be used by an Oracle user. The maximum value that can be assigned is an integer value typically equal to the number of CPUs on your Oracle instances, times the number of cores per CPU, times the number of Oracle instances. For example, suppose you have a RAC environment with 2 Oracle instances. And suppose that each system has 2 CPUs with 32 cores. The maximum DOP would be 128 (i.e. 2*2*32). In point of fact if you are running on a production system, the maximum DOP you are allowed to use will be restricted by the Oracle DBA. This is because using a system maximum DOP can subsume all system resources on Oracle and starve anything else that is executing. Obviously on a production system where resources need to be shared 24x7, this can’t be allowed to happen. The use cases for being able to run OSCH with a maximum DOP are when you have exclusive access to all the resources on an Oracle system. This can be in situations when your are first seeding tables in a new Oracle database, or there is a time where normal activity in the production database can be safely taken off-line for a few hours to free up resources for a big incremental load. Using OSCH on high end machines (specifically Oracle Exadata and Oracle BDA cabled with Infiniband), this mode of operation can load up to 15TB per hour. The bottom line is that you should first figure out what DOP you will be allowed to run with by talking to the DBAs who manage the production system. You then use that number to derive the number of location files, and (optionally) the number of HDFS data files that you want to generate, assuming that is flexible. Rule 1: Find out the maximum DOP you will be allowed to use with OSCH on the target Oracle system Determining the Number of Location Files Let’s assume that the DBA told you that your maximum DOP was 8. You want the number of location files in your external table to be big enough to utilize all 8 PQ slaves, and you want them to represent equally balanced workloads. Remember location files in OSCH are metadata lists of HDFS files and are created using OSCH’s External Table tool. They also represent the workload size given to an individual Oracle PQ slave (i.e. a PQ slave is given one location file to process at a time, and only it will process the contents of the location file.) Rule 2: The size of the workload of a single location file (and the PQ slave that processes it) is the sum of the content size of the HDFS files it lists For example, if a location file lists 5 HDFS files which are each 100GB in size, the workload size for that location file is 500GB. The number of location files that you generate is something you control by providing a number as input to OSCH’s External Table tool. Rule 3: The number of location files chosen should be a small multiple of the DOP Each location file represents one workload for one PQ slave. So the goal is to keep all slaves busy and try to give them equivalent workloads. Obviously if you run with a DOP of 8 but have 5 location files, only five PQ slaves will have something to do and the other three will have nothing to do and will quietly exit. If you run with 9 location files, then the PQ slaves will pick up the first 8 location files, and assuming they have equal work loads, will finish up about the same time. But the first PQ slave to finish its job will then be rescheduled to process the ninth location file, potentially doubling the end to end processing time. So for this DOP using 8, 16, or 32 location files would be a good idea. Determining the Number of HDFS Files Let’s start with the next rule and then explain it: Rule 4: The number of HDFS files should try to be a multiple of the number of location files and try to be relatively the same size In our running example, the DOP is 8. This means that the number of location files should be a small multiple of 8. Remember that each location file represents a list of unique HDFS files to load, and that the sum of the files listed in each location file is a workload for one Oracle PQ slave. The OSCH External Table tool will look in an HDFS directory for a set of HDFS files to load. It will generate N number of location files (where N is the value you gave to the tool). It will then try to divvy up the HDFS files and do its best to make sure the workload across location files is as balanced as possible. (The tool uses a greedy algorithm that grabs the biggest HDFS file and delegates it to a particular location file. It then looks for the next biggest file and puts in some other location file, and so on). The tools ability to balance is reduced if HDFS file sizes are grossly out of balance or are too few. For example suppose my DOP is 8 and the number of location files is 8. Suppose I have only 8 HDFS files, where one file is 900GB and the others are 100GB. When the tool tries to balance the load it will be forced to put the singleton 900GB into one location file, and put each of the 100GB files in the 7 remaining location files. The load balance skew is 9 to 1. One PQ slave will be working overtime, while the slacker PQ slaves are off enjoying happy hour. If however the total payload (1600 GB) were broken up into smaller HDFS files, the OSCH External Table tool would have an easier time generating a list where each workload for each location file is relatively the same. Applying Rule 4 above to our DOP of 8, we could divide the workload into160 files that were approximately 10 GB in size. For this scenario the OSCH External Table tool would populate each location file with 20 HDFS file references, and all location files would have similar workloads (approximately 200GB per location file.) As a rule, when the OSCH External Table tool has to deal with more and smaller files it will be able to create more balanced loads. How small should HDFS files get? Not so small that the HDFS open and close file overhead starts having a substantial impact. For our performance test system (Exadata/BDA with Infiniband), I compared three OSCH loads of 1 TiB. One load had 128 HDFS files living in 64 location files where each HDFS file was about 8GB. I then did the same load with 12800 files where each HDFS file was about 80MB size. The end to end load time was virtually the same. However when I got ridiculously small (i.e. 128000 files at about 8MB per file), it started to make an impact and slow down the load time. What happens if you break rules 3 or 4 above? Nothing draconian, everything will still function. You just won’t be taking full advantage of the generous DOP that was allocated to you by your friendly DBA. The key point of the rules articulated above is this: if you know that HDFS content is ultimately going to be loaded into Oracle using OSCH, it makes sense to chop them up into the right number of files roughly the same size, derived from the DOP that you expect to use for loading. Next Steps So far we have talked about OLH and OSCH as alternative models for loading. That’s not quite the whole story. They can be used together in a way that provides for more efficient OSCH loads and allows one to be more flexible about scheduling on a Hadoop cluster and an Oracle Database to perform load operations. The next lesson will talk about Oracle Data Pump files generated by OLH, and loaded using OSCH. It will also outline the pros and cons of using various load methods. This will be followed up with a final tutorial lesson focusing on how to optimize OLH and OSCH for use on Oracle's engineered systems: specifically Exadata and the BDA. /* Style Definitions */ table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt; font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-family:Calibri; mso-hansi-theme-font:minor-latin;}

Read the article
Fastest way to parse XML files in C#?

- by LifeH2O

I have to load many XML files from internet. But for testing with better speed i downloaded all of them (more than 500 files) of the following format. <player-profile> <personal-information> <id>36</id> <fullname>Adam Gilchrist</fullname> <majorteam>Australia</majorteam> <nickname>Gilchrist</nickname> <shortName>A Gilchrist</shortName> <dateofbirth>Nov 14, 1971</dateofbirth> <battingstyle>Left-hand bat</battingstyle> <bowlingstyle>Right-arm offbreak</bowlingstyle> <role>Wicket-Keeper</role> <teams-played-for>Western Australia, New South Wales, ICC World XI, Deccan Chargers, Australia</teams-played-for> <iplteam>Deccan Chargers</iplteam> </personal-information> <batting-statistics> <odi-stats> <matchtype>ODI</matchtype> <matches>287</matches> <innings>279</innings> <notouts>11</notouts> <runsscored>9619</runsscored> <highestscore>172</highestscore> <ballstaken>9922</ballstaken> <sixes>149</sixes> <fours>1000+</fours> <ducks>0</ducks> <fifties>55</fifties> <catches>417</catches> <stumpings>55</stumpings> <hundreds>16</hundreds> <strikerate>96.95</strikerate> <average>35.89</average> </odi-stats> <test-stats> . . . </test-stats> <t20-stats> . . . </t20-stats> <ipl-stats> . . . </ipl-stats> </batting-statistics> <bowling-statistics> <odi-stats> . . . </odi-stats> <test-stats> . . . </test-stats> <t20-stats> . . . </t20-stats> <ipl-stats> . . . </ipl-stats> </bowling-statistics> </player-profile> I am using XmlNodeList list = _document.SelectNodes("/player-profile/batting-statistics/odi-stats"); And then loop this list with foreach as foreach (XmlNode stats in list) { _btMatchType = GetInnerString(stats, "matchtype"); //it returns null string if node not availible . . . . _btAvg = Convert.ToDouble(stats["average"].InnerText); } Even i am loading all files offline, parsing is very slow Is there any good faster way to parse them? Or is it problem with SQL? I am saving all extracted data from XML to database using DataSets, TableAdapters with insert command. I

Read the article

Developer IT

datasets - Page 15 - Developer IT

Search Results

Search found 356 results on 15 pages for 'datasets'.

Page 15/15 | < Previous Page | 11 12 13 14 15

When someone deletes a shared data source in SSRS

- by Rob Farley

Read the article

Oracle BI Server Modeling, Part 1- Designing a Query Factory

- by bob.ertl(at)oracle.com

Read the article

CodePlex Daily Summary for Wednesday, November 21, 2012

Read the article

CodePlex Daily Summary for Friday, March 18, 2011

Read the article

How to Load Oracle Tables From Hadoop Tutorial (Part 5 - Leveraging Parallelism in OSCH)

- by Bob Hanckel

Read the article

Fastest way to parse XML files in C#?

- by LifeH2O

Read the article

< Previous Page | 11 12 13 14 15