Search Results

Search found 9371 results on 375 pages for 'existing'.


Universities 2030: Learning from the Past to Anticipate the Future

- by Mohit Phogat

What will the landscape of international higher education look like a generation from now? What challenges and opportunities lie ahead for universities, especially “global” research universities? And what can university leaders do to prepare for the major social, economic, and political changes—both foreseen and unforeseen—that may be on the horizon? The nine essays in this collection proceed on the premise that one way to envision “the global university” of the future is to explore how earlier generations of university leaders prepared for “global” change—or at least responded to change—in the past. As the essays in this collection attest, many of the patterns associated with contemporary “globalization” or “internationalization” are not new; similar processes have been underway for a long time (some would say for centuries).[1] A comparative-historical look at universities’ responses to global change can help today’s higher-education leaders prepare for the future. Written by leading historians of higher education from around the world, these nine essays identify “key moments” in the internationalization of higher education: moments when universities and university leaders responded to new historical circumstances by reorienting their relationship with the broader world. Covering more than a century of change—from the late nineteenth century to the early twenty-first—they explore different approaches to internationalization across Europe, Asia, Australia, North America, and South America. 
Notably, while the choice of historical eras was left entirely open, the essays converged around four periods: the 1880s and the international extension of the “modern research university” model; the 1930s and universities’ attempts to cope with international financial and political crises; the 1960s and universities’ role in an emerging postcolonial international development apparatus; and the 2000s and the rise of neoliberal efforts to reform universities in the name of international economic “competitiveness.” Each of these four periods saw universities adopt new approaches to internationalization in response to major historical-structural changes, and each has clear parallels to today. Among the most important historical-structural challenges that universities confronted were: (1) fluctuating enrollments and funding resources associated with global economic booms and busts; (2) new modes of transportation and communication that facilitated mobility (among students, scholars, and knowledge itself); (3) increasing demands for applied science, technical expertise, and commercial innovation; and (4) ideological reconfigurations accompanying regime changes (e.g., from one internal regime to another, from colonialism to postcolonialism, from the cold war to globalized capitalism, etc.). Like universities today, universities in the past responded to major historical-structural changes by internationalizing: by joining forces across space to meet new expectations and solve problems on an ever-widening scale. Approaches to internationalization have typically built on prior cultural or institutional ties. In general, only when the benefits of existing ties had been exhausted did universities reach out to foreign (or less familiar) partners. 
As one might expect, this process of “reaching out” has stretched universities’ traditional cultural, political, and/or intellectual bonds and has invariably presented challenges, particularly when national priorities have differed—for example, with respect to curricular programs, governance structures, norms of academic freedom, etc. Strategies of university internationalization that either ignore or downplay cultural, political, or intellectual differences often fail, especially when the pursuit of new international connections is perceived to weaken national ties. If the essays in this collection agree on anything, they agree that approaches to internationalization that seem to “de-nationalize” the university usually do not succeed (at least not for long). Please continue reading the other essays at http://globalhighered.wordpress.com/

Q&A: Drive Online Engagement with Intuitive Portals and Websites

- by kellsey.ruppel

We had a great webcast yesterday and wanted to recap the questions that were asked throughout.

Can ECM distribute content to 3rd party sites?
ECM, which is now called WebCenter Content, can distribute content to 3rd party sites via several means, including SSXA (Site Studio for External Applications).

Will you be able to provide more information on these means and SSXA?
If you have an existing JSP application, you can add the SSXA libraries to the IDE where your application was built (JDeveloper, for example). You can then drop some code into your 3rd party site/application that can both create and pull dynamically contributable content out of the Content Server for inclusion in your pages. If the 3rd party site is not a JSP application, there is also the option of leveraging two Site Studio (not SSXA) specific custom WebCenter Content services to pull Site Studio XML content into a page. More information on SSXA can be found here: http://docs.oracle.com/cd/E17904_01/doc.1111/e13650/toc.htm

Is there another way than a "gadget" to integrate applications (like a loan simulator) in WebCenter Sites?
There are some other ways, such as leveraging the Pagelet Producer, which is a core component of WebCenter Portal. Oracle WebCenter Portal's Pagelet Producer (previously known as Oracle WebCenter Ensemble) provides a collection of useful tools and features that facilitate dynamic pagelet development. A pagelet is a reusable user interface component. Any HTML fragment can be a pagelet, but pagelet developers can also write pagelets that are parameterized and configurable, that dynamically interact with other pagelets, and that respond to user input. Pagelets are similar to portlets, but while portlets were designed specifically for portals, pagelets can be run on any web page, including within a portal or other web application. Pagelets can be used to expose platform-specific portlets in other web environments. More on Pagelet Producer can be found here: http://docs.oracle.com/cd/E23943_01/webcenter.1111/e10148/jpsdg_pagelet.htm#CHDIAEHG

Can you describe the mechanism available to achieve the context transfer of content?
The primary goal of context transfer is to provide a uniform experience to customers as they transition from one channel to another. The use case discussed in the webcast was a customer moving from the .com marketing website to the self-service site where the customer wants to manage his account information. If WebCenter Sites was able to identify the customer and place him in a segment that is a potential target for some promotions, the same promotions should be targeted to the customer when he is in the self-service site, which is managed by WebCenter Portal. The context transfer can be achieved by calling the WebCenter Sites Engage Server APIs, which identify the segment that the customer has been bucketed into. Then, again through REST APIs, WebCenter Portal can request from WebCenter Sites the specific content that needs to be targeted to the customer for the identified segment. While this can be achieved through custom integration today, Oracle is looking into productizing this integration in future releases.

How can context be transferred from WebCenter Sites (marketing site) to WebCenter Portal (online services)?
The WebCenter Portal Personalization server can call into the WebCenter Sites Engage Server to identify the segment for the user and then, through REST APIs, request the specific content that needs to be surfaced in the Portal.

Still have questions? Leave them in the comments section! And you can catch a replay of the webcast here.

MPI Cluster Debugger launch integration in VS2010

Let's assume that you have all the HPC bits installed and that you have existing MPI code (or you created a "Hello World" project using the MPI project template). Of course, you create a single MPI application and at runtime it will correspond to multiple processes (of the same app) launched on multiple nodes (i.e. machines) on the cluster. So how do you debug such a situation by simply hitting the familiar "F5" keystroke (i.e. Debug - Start Debugging)?

WATCH IT INSTEAD OF READING ABOUT IT
If you can't bear to read through all the details below, just watch this 19-minute screencast explaining this VS2010 feature. Alternatively, or even additionally, keep on reading.

REQUIREMENT
When you debug an MPI application, you want the copying of resources from your client machine (where Visual Studio is installed) to each compute node (where Windows HPC Server is installed) to take place automatically. 'Resources' in the previous sentence includes your application binary, plus any binary or data dependencies it may have, plus PDBs if needed, plus the debug CRT of the correct bitness, plus msvsmon for remote debugging to work. You also want, after copying is complete, to have your app and msvsmon launched and attached so that you can hit breakpoints back in Visual Studio on your client machine. All of these things are delivered in VS2010.

STEPS TO F5
1. In your MPI project, where you have placed a breakpoint, go to Project Properties - Configuration Properties - Debugging. Ensure the "Debugger to launch" combo box value is set to MPI Cluster Debugger.
2. There are a whole bunch of properties here and typically you can ignore all of them except one: Run Environment. By default it is set to run 1 process on your local machine, and if you change the number after that to, for example, 4, it will launch 4 processes of your app on your local machine. You want this to run on your cluster though, so go to the dropdown arrow at the end of the Run Environment cell and open it to expose the "Edit Hpc node" menu, which opens the Node Selector dialog. In this dialog you can enter (or pick from a list) the cluster head node name and then the number of processes you want to execute on the cluster, then hit OK and you are done.
3. Press F5 and watch your breakpoint get hit (after giving it some time for copying, remote execution, attachment and symbol resolution to take place).

GOING DEEPER
In the MPI Cluster Debugger project properties above, you can see many properties in addition to Run Environment. They are all optional, but you may want to understand them in order to fine-tune your cluster debugging. Read all about each one of these on the MSDN page Configuration Properties for the MPI Cluster Debugger.

In the Node Selector dialog above you can see more options than just the Head Node name and Number of Processes to run. They should be self-explanatory, but I also cover them in depth in my screencast, showing you an example of why you would choose to schedule processes per core versus per node. You can also read about these options on MSDN as part of the page How to: Configure and Launch the MPI Cluster Debugger.

To read through an example that touches on MPI project creation, project properties, node selector, and also usage of MPI with OpenMP plus MPI with PPL, read the MSDN page Walkthrough: Launching the MPI Cluster Debugger in Visual Studio 2010.

Happy MPI debugging! Comments about this post welcome at the original blog.

Columnstore Case Study #1: MSIT SONAR Aggregations

- by aspiringgeek

Preamble
This is the first in a series of posts documenting big wins encountered using columnstore indexes in SQL Server 2012 & 2014. Many of these can be found in this deck along with details such as internals, best practices, caveats, etc. The purpose of sharing the case studies in this context is to provide an easy-to-consume quick-reference alternative.

Why Columnstore?
If we’re looking for a subset of columns from one or a few rows, given the right indexes, SQL Server can do a superlative job of providing an answer. If we’re asking a question which by design needs to hit lots of rows (DW, reporting, aggregations, grouping, scans, etc.), SQL Server has never had a good mechanism, until columnstore. Columnstore indexes were introduced in SQL Server 2012. However, they're still largely unknown. Some adoption blockers existed; yet columnstore was nonetheless a game changer for many apps. In SQL Server 2014, potential blockers have been largely removed & columnstore indexes are going to profoundly change the way we interact with our data. The purpose of this series is to share the performance benefits of columnstore & to document why columnstore is a compelling reason to upgrade to SQL Server 2014.

App: MSIT SONAR Aggregations
At MSIT, performance & configuration data is captured by SCOM. We archive much of the data in a partitioned data warehouse table in SQL Server 2012 for reporting via an application called SONAR. By definition, this is a primary use case for columnstore: report queries requiring aggregation over large numbers of rows. New data is refreshed each night by an automated table partitioning mechanism, a best-practices scenario for columnstore.

The Win
The baseline was classic indexing, which produced the expected query plan selection, including partition elimination. Compared to that baseline, the SQL Server 2012 nonclustered columnstore index improved query performance significantly. Logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Other than creating the columnstore index, no special modifications or tweaks to the app or database schema were necessary to achieve the performance improvements. Existing nonclustered indexes were rendered superfluous & were deleted, thus mitigating maintenance challenges such as defragging as well as conserving disk capacity.

Details
The table provides the raw data & summarizes the performance deltas.

                               Logical Reads (8K pages)    CPU (ms)    Duration (ms)
Columnstore                                     160,323      20,360            9,786
Conventional Table & Indexes                  9,053,423     549,608          193,903
Δ                                                   x56         x27              x20

The charts provide additional perspective on this data. "Conventional vs. Columnstore Metrics" documents the raw data. Note on this linear display the magnitude of the conventional index metrics vs. columnstore. The “Metrics (Δ)” chart expresses these values as a ratio.

Summary
For DW, reports, & other BI workloads, columnstore often provides significant performance enhancements relative to conventional indexing. I have documented here, in the first of a series of reports on columnstore implementations, results from an initial implementation at MSIT in which logical reads were reduced by over a factor of 50; both CPU & duration improved by factors of 20 or more. Subsequent features in this series document performance enhancements that are even more significant.
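As a quick check on the deltas, the ratios can be recomputed from the raw figures in the table. This is only a sketch of the arithmetic; the dictionary keys are names chosen here for illustration and do not come from SONAR or SQL Server.

```python
# Raw figures copied from the table above; key names are illustrative only.
columnstore = {"logical_reads": 160_323, "cpu_ms": 20_360, "duration_ms": 9_786}
conventional = {"logical_reads": 9_053_423, "cpu_ms": 549_608, "duration_ms": 193_903}

# Ratio of the conventional metric to the columnstore metric (higher = bigger win).
ratios = {name: conventional[name] / columnstore[name] for name in columnstore}

for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.0f}")
```

Rounded to whole numbers, the ratios match the x56, x27 and x20 reported in the table.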

Head in the Clouds

- by Tony Davis

We're just past the second anniversary of the launch of Windows Azure. A couple of years' experience with Azure in the industry has provided some obvious success stories, but has deflated some of the initial marketing hyperbole. As a general principle, Azure seems to work well in providing a Service-Oriented Architecture for services in enterprises that suffer wide fluctuations in demand. Instead of being obliged to provide hardware sufficient for the occasional peaks in demand, one can hire capacity only when it is needed, and the cost of hosting an application is no longer a capital cost. It enables companies to avoid having to scale out hardware for peak periods only to see it underused for the rest of the time. A customer-facing application such as a concert ticketing system, which suffers high demand in short, predictable bursts of activity, is a great example of an application that would work well in Azure. However, moving existing applications to Azure isn't something to be done on impulse. Unless your application is .NET-based, and consists of 'stateless' components that communicate via queues, you are probably in for a lot of redevelopment work. It makes most sense for IT departments who are already deep in this .NET mindset, and who also want 'grown-up' methods of staging, testing, and deployment. Azure fits well with this culture and offers, as a bonus, good Visual Studio integration. The most-commonly stated barrier to porting these applications to Azure is the problem of reconciling the use of the cloud with legislation for data privacy and security. Putting databases in the cloud is a sticky issue for many and impossible for some due to compliance and security issues, the need for direct control over data, and so on. In the face of feedback from the early adopters of Azure, Microsoft has broadened the architectural choices to cater for a wide range of requirements. 
First, as well as SQL Azure Database (SAD) and Azure storage (the unstructured 'BLOB and Entity-Attribute-Value' NoSQL storage alternative, which equates more closely with folders and files than a database), Windows Azure offers a wide range of storage options, including the use of services such as OData: developers who are programming for Windows Azure can simply choose the one most appropriate for their needs. Secondly, and crucially, the Windows Azure architecture allows you the freedom to produce hybrid applications, where only those parts that need cloud-based hosting are deployed to Azure, whereas those parts that must unavoidably be hosted in a corporate datacenter can stay there. By using a hybrid architecture, it will seldom, if ever, be necessary to move an entire application to the cloud, along with personal and financial data. For example, we could port to Azure only those parts of our ticketing application that capture and process ticket orders. Once an order is captured, the financial side can be processed in our own data center. In short, Windows Azure seems to be a very effective way of providing services that are subject to wide but predictable fluctuations in demand. Have you come to the same conclusions, or do you think I've got it wrong? If you've had experience with Azure, would you recommend it? It would be great to hear from you. Cheers, Tony.

Mobile Deals: the Consumer Wants You in Their Pocket

- by Mike Stiles

Mobile deals offer something we talk about a lot in social marketing, relevant content. If a consumer is already predisposed to liking your product and gets a timely deal for it that’s easy and convenient to use, not only do you score on the marketing side, it clearly generates some of that precious ROI that’s being demanded of social. First, a quick gut-check on the public’s adoption of mobile. Nielsen figures have 55.5% of US mobile owners using smartphones. If young people are indeed the future, you can count on the move to mobile exploding exponentially. Teens are the fastest growing segment of smartphone users, and 58% of them have one. But the largest demographic of smartphone users is 25-34 at 74%. That tells you a focus on mobile will yield great results now, and even better results straight ahead. So we can tell both from statistics and from all the faces around you that are buried in their smartphones this is where consumers are. But are they looking at you? Do you have a valid reason why they should? Everybody likes a good deal. BIA/Kelsey says US consumers will spend $3.6 billion this year for daily deals (the Groupons and LivingSocials of the world), up 87% from 2011. The report goes on to say over 26% of small businesses are either "very likely" or "extremely likely" to offer up a deal in the next 6 months. Retail Gazette reports 58% of consumers shop with coupons, a 40% increase in 4 years. When you consider that a deal can be the impetus for a real-world transaction, a first-time visit to a store, an online purchase, entry into a loyalty program, a social referral, a new fan or follower, etc., that 26% figure shows us there’s a lot of opportunity being left on the table by brands. The existing and emerging technologies behind mobile devices make the benefits of offering deals listed above possible. Take how mobile payment systems are being tied into deal delivery and loyalty programs. If it’s really easy to use a coupon or deal, it’ll get used. 
If it’s complicated, it’ll be passed over as “not worth it.” When you can pay with your mobile via technologies that connect store and user, you get the deal, you get the loyalty credit, you pay, and your receipt is uploaded, all in one easy swipe. Nothing to keep track of, nothing to lose or forget about. And the store “knows” you, so future offers will be based on your tastes. Consider the endgame. A customer who’s a fan of your belt buckle store’s Facebook Page is in one of your physical retail locations. They pull up your app, because they’ve gotten used to a loyalty deal being offered when they go to your store. Voila. A 10% discount active for the next 30 minutes. Maybe the app also surfaces social references to your brand made by friends so they can check out a buckle someone’s raving about. If they aren’t a fan of your Page or don’t have your app, perhaps they’ve opted into location-based deal services so you can still get them that 10% deal while they’re in the store. Or maybe they’ve walked in with a pre-purchased Groupon or LivingSocial voucher. They pay with one swipe, and you’ve learned about their buying preferences, credited their loyalty account and can encourage them to share a pic of their new buckle on social. Happy customer. Happy belt buckle company. All because the brand was willing to use the tech that’s available to meet consumers where they are, incentivize them, and show them how much they’re valued through rewards.

Caching factory design

- by max

I have a factory class XFactory that creates objects of class X. Instances of X are very large, so the main purpose of the factory is to cache them, as transparently to the client code as possible. Objects of class X are immutable, so the following code seems reasonable:

    # module xfactory.py
    import x

    class XFactory:
        _registry = {}  # class-level cache shared by all factory instances

        def get_x(self, arg1, arg2, use_cache=True):
            if use_cache:
                hash_id = hash((arg1, arg2))
                if hash_id in self._registry:
                    return self._registry[hash_id]
            obj = x.X(arg1, arg2)
            if use_cache:
                # hash_id only exists on the caching path
                self._registry[hash_id] = obj
            return obj

    # module x.py
    class X:
        # ...

Is it a good pattern? (I know it's not the actual Factory Pattern.) Is there anything I should change?

Now, I find that sometimes I want to cache X objects to disk. I'll use pickle for that purpose, and store as values in the _registry the filenames of the pickled objects instead of references to the objects. Of course, _registry itself would have to be stored persistently (perhaps in a pickle file of its own, in a text file, in a database, or simply by giving pickle files filenames that contain hash_id).

Except now the validity of the cached object depends not only on the parameters passed to get_x(), but also on the version of the code that created these objects. Strictly speaking, even a memory-cached object could become invalid if someone modifies x.py or any of its dependencies, and reloads it while the program is running. So far I have ignored this danger since it seems unlikely for my application. But I certainly cannot ignore it when my objects are cached to persistent storage.

What can I do? I suppose I could make the hash_id more robust by calculating the hash of a tuple that contains arguments arg1 and arg2, as well as the filename and last modified date for x.py and every module and data file that it (recursively) depends on. To help delete cache files that won't ever be useful again, I'd add to the _registry the unhashed representation of the modified dates for each record.

But even this solution isn't 100% safe, since theoretically someone might load a module dynamically, and I wouldn't know about it from statically analyzing the source code. Even if I go all out and assume every file in the project is a dependency, the mechanism will still break if some module grabs data from an external website, etc. In addition, the frequency of changes in x.py and its dependencies is quite high, leading to heavy cache invalidation.

Thus, I figured I might as well give up some safety, and invalidate the cache only when there is an obvious mismatch. This means that class X would have a class-level cache validation identifier that should be changed whenever the developer believes a change happened that should invalidate the cache. (With multiple developers, a separate invalidation identifier is required for each.) This identifier is hashed along with arg1 and arg2 and becomes part of the hash keys stored in _registry.

Since developers may forget to update the validation identifier, or not realize that they invalidated the existing cache, it would seem better to add another validation mechanism: class X can have a method that returns all the known "traits" of X. For instance, if X is a table, I might add the names of all the columns. The hash calculation will include the traits as well.

I can write this code, but I am afraid that I'm missing something important; and I'm also wondering if perhaps there's a framework or package that can do all of this stuff already. Ideally, I'd like to combine in-memory and disk-based caching.
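To make the scheme above concrete, here is a minimal sketch of a factory that combines an in-memory cache with pickle files on disk and folds a hand-maintained version identifier and a tuple of traits into the key. Everything named here (CachingFactory, build, version, traits, cache_dir) is hypothetical and chosen for illustration; it also assumes the arguments have a stable repr() and that the built objects can be pickled. A SHA-256 digest is used instead of the built-in hash() because string hashes are salted per process and so are not stable across runs.

```python
import hashlib
import pickle
from pathlib import Path


class CachingFactory:
    """Two-level cache (memory, then disk) keyed on arguments, version and traits."""

    def __init__(self, build, version, traits, cache_dir):
        self._build = build          # callable that creates the expensive object
        self._version = version      # bumped by hand to invalidate old entries
        self._traits = tuple(traits) # e.g. column names, as suggested above
        self._memory = {}
        self._cache_dir = Path(cache_dir)
        self._cache_dir.mkdir(parents=True, exist_ok=True)

    def _key(self, args):
        # Assumption: repr() of the arguments is stable between runs.
        raw = repr((self._version, self._traits, args)).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get(self, *args, use_cache=True):
        if not use_cache:
            return self._build(*args)
        key = self._key(args)
        if key in self._memory:
            return self._memory[key]
        path = self._cache_dir / (key + ".pickle")
        if path.exists():
            # Disk hit: load the object pickled by an earlier run or instance.
            with path.open("rb") as handle:
                obj = pickle.load(handle)
        else:
            # Miss on both levels: build once and persist for later runs.
            obj = self._build(*args)
            with path.open("wb") as handle:
                pickle.dump(obj, handle)
        self._memory[key] = obj
        return obj
```

Because the version and traits are part of the key, changing either one simply makes old files unreachable rather than deleting them; cleaning up stale files would need a separate step. Note also that unpickling is only safe for files the program wrote itself.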

Investigating on xVelocity (VertiPaq) column size

- by Marco Russo (SQLBI)

In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)”, but xVelocity and VertiPaq have the same meaning when we talk about Analysis Services. In this post I’ll show how to investigate the column sizes of an existing Tabular database so that you can find the most important columns to optimize.

A first approach is to look in the DataDir of Analysis Services for the folder containing the database. Then look for the biggest files in all subfolders: the name of the biggest file contains the name of the most expensive column. However, this heuristic process is not very efficient. A better approach is using a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested in and execute it) you will see all database objects sorted by used size in descending order.

    SELECT *
    FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS
    ORDER BY used_size DESC

You can look at the first rows in order to understand which are the most expensive columns in your tabular model. The interesting data provided are:

TABLE_ID: the name of the object; it can also be a dictionary or an index
COLUMN_ID: the name of the column the object belongs to; you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes
RECORDS_COUNT: the number of rows in the column
USED_SIZE: the memory used by the object

By looking at the ratio between USED_SIZE and RECORDS_COUNT you can understand what you can do in order to optimize your tabular model. Your options are:

Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model.

Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (e.g. a temperature), round the number to the decimal precision that is relevant to you.

Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in the VertiPaq optimization article.

Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values, and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a small number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio.

After the optimization you should be able to reduce the used size and improve the count/size ratio you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want to understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.
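If you export the DMV result (for example to a list of dictionaries), the ratio between USED_SIZE and RECORDS_COUNT can be computed and ranked with a few lines of Python. This is only a sketch: the function name and the SIZE_PER_ROW key are made up here, and the sample rows are invented values for illustration, not output from a real database.

```python
def rank_by_used_size(rows):
    """Add a size-per-row ratio to each DMV row and sort by USED_SIZE, largest first."""
    ranked = []
    for row in rows:
        records = row["RECORDS_COUNT"]
        # Guard against empty objects to avoid dividing by zero.
        ratio = row["USED_SIZE"] / records if records else None
        ranked.append({**row, "SIZE_PER_ROW": ratio})
    ranked.sort(key=lambda r: r["USED_SIZE"], reverse=True)
    return ranked


# Invented sample rows, shaped like the four columns discussed above.
sample = [
    {"TABLE_ID": "Sales", "COLUMN_ID": "OrderDate", "RECORDS_COUNT": 1000, "USED_SIZE": 4000},
    {"TABLE_ID": "Sales", "COLUMN_ID": "Timestamp", "RECORDS_COUNT": 1000, "USED_SIZE": 64000},
    {"TABLE_ID": "Sales", "COLUMN_ID": "Empty", "RECORDS_COUNT": 0, "USED_SIZE": 0},
]

for row in rank_by_used_size(sample):
    print(row["COLUMN_ID"], row["USED_SIZE"], row["SIZE_PER_ROW"])
```

Columns with a high size per row are the natural candidates for the options listed above (removing, changing granularity, splitting or sorting).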

Read the article
Know your Data Lineage

- by Simon Elliston Ball

An academic paper without the footnotes isn’t an academic paper. Journalists wouldn’t base a news article on facts that they can’t verify. So why would anyone publish reports without being able to say where the data has come from and be confident of its quality, in other words, without knowing its lineage. (sometimes referred to as ‘provenance’ or ‘pedigree’) The number and variety of data sources, both traditional and new, increases inexorably. Data comes clean or dirty, processed or raw, unimpeachable or entirely fabricated. On its journey to our report, from its source, the data can travel through a network of interconnected pipes, passing through numerous distinct systems, each managed by different people. At each point along the pipeline, it can be changed, filtered, aggregated and combined. When the data finally emerges, how can we be sure that it is right? How can we be certain that no part of the data collection was based on incorrect assumptions, that key data points haven’t been left out, or that the sources are good? Even when we’re using data science to give us an approximate or probable answer, we cannot have any confidence in the results without confidence in the data from which it came. You need to know what has been done to your data, where it came from, and who is responsible for each stage of the analysis. This information represents your data lineage; it is your stack-trace. If you’re an analyst, suspicious of a number, it tells you why the number is there and how it got there. If you’re a developer, working on a pipeline, it provides the context you need to track down the bug. If you’re a manager, or an auditor, it lets you know the right things are being done. Lineage tracking is part of good data governance. Most audit and lineage systems require you to buy into their whole structure. 
If you are using Hadoop for your data storage and processing, then tools like Falcon allow you to track lineage, as long as you are using Falcon to write and run the pipeline. It can mean learning a new way of running your jobs (or using some sort of proxy), and even a distinct way of writing your queries. Other Hadoop tools provide a lot of operational and audit information, spread throughout the many logs produced by Hive, Sqoop, MapReduce and all the various moving parts that make up the ecosystem. To get a full picture of what’s going on in your Hadoop system you need to capture both Falcon lineage and the data exhaust of other tools that Falcon can’t orchestrate. However, the problem is even bigger than that. Often, Hadoop is just one piece in a larger processing workflow. The next step of the challenge is how you bind together the lineage metadata describing what happened before and after Hadoop, where ‘after’ could be a data analysis environment like R, an application, or even an end-user tool such as Tableau or Excel. One possibility is to push as much as you can of your key analytics into Hadoop, but would you give up the power and familiarity of your existing tools in return for a reliable way of tracking lineage? Lineage and auditing should work consistently, automatically and quietly, allowing users to access their data with any tool they need. The real solution, therefore, is to create a consistent method for bringing lineage data from these various disparate sources into the data analysis platform that you use, rather than being forced to use the tool that manages the pipeline for the lineage and a different tool for the data analysis. The key is to keep your logs and your audit data from every source, bring them together, and use the data analysis tools to trace the paths from raw data to the answer that data analysis provides.
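
The idea of bringing audit records from every source together and tracing a path back from an answer to its raw data can be sketched as follows. This is an illustration only: the record layout (`input`, `output`, `tool`), the dataset names and the three per-tool logs are hypothetical, and real Hive, Sqoop or R logs would first need parsing into such a shape.

```python
from collections import defaultdict


def build_upstream_index(records):
    """Index lineage records by the dataset they produced."""
    upstream = defaultdict(list)
    for rec in records:
        upstream[rec["output"]].append(rec)
    return upstream


def trace_lineage(dataset, upstream, seen=None):
    """Return every step that contributed to `dataset`, nearest step first."""
    seen = set() if seen is None else seen
    steps = []
    for rec in upstream.get(dataset, []):
        key = (rec["input"], rec["output"], rec["tool"])
        if key in seen:
            continue  # guard against cycles and duplicate log entries
        seen.add(key)
        steps.append(rec)
        steps.extend(trace_lineage(rec["input"], upstream, seen))
    return steps


# Hypothetical records, one list per tool's log.
sqoop_log = [{"input": "crm.orders", "output": "hdfs:/raw/orders", "tool": "Sqoop"}]
hive_log = [{"input": "hdfs:/raw/orders", "output": "hive.daily_sales", "tool": "Hive"}]
r_log = [{"input": "hive.daily_sales", "output": "report.q1_revenue", "tool": "R"}]

index = build_upstream_index(sqoop_log + hive_log + r_log)
for step in trace_lineage("report.q1_revenue", index):
    print(f'{step["output"]} <- {step["input"]} (via {step["tool"]})')
```

Because the records are merged before tracing, the walk crosses tool boundaries without any single tool having to own the whole pipeline.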

Read the article
Web Developer - How to enhance my skillset?

- by atif089

First of all pardon my English. I am not a native English speaker I have been a Web Developer for the past 4 years. In these 4 years I have spent my time on the internet to learn things. My current skillset comprises of HTML CSS PHP MySQL jQuery (I would not say js and rather say jQuery because I am good at using jQuery and bad with plain javascript.) The above things seemed like an easier part of my life as I quickly learned them. But now I would really like to enhance my skillset and I am pretty confused which way to move ahead considering that I have to learn things using the web and references on my own. Design My first option is towards design. Shall I get started with design and start using Adobe Illustrator, Photoshop, Flash, Flex. Designing along with my previous skills looks like a money maker to me. As both are co-related to each other when web design is considered. And its easier to learn the first 2 and I hope I can get tutorials for the last 2 as well. Marketing A lot of my existing clients asked me if I do SEO. So this looked as a good field to me as well. I cannot estimate the scope of SEO but I assume it has a long future. Since I am business minded as well and there are a lot of tutorials around, should I start with SEO, SEM, Social Media, PPC or whatever it consists of. Software Development The complex plight and hardest thing (perhaps) but the easiest way to find a decent job in my location. If I go for software development what platform should be that I should be ideally going after? Should it be C# for windows development, or ASP.NET (once again enhances my skill set), J2EE (there are a lot of jobs for J2EE developers here) or plain C and C++. Also I think it is difficult to learn software languages right from Hello World, using internet? I have no clue how I learned PHP but I am sort of a pro now, but these other languages seems like a disaster to me? 
I cant figure out the reason if its because PHP is easier or there was a lot of tutorials around for PHP. Anyways is it also possible to learn software development right from Hello World using the web? Database / Server (Linux) / Network Administration Seems like a job with a decent pay but less number of jobs and a bit harder to learn online. (not sure) What should be the right track I should move ahead. P.S - Age is not a constraint for me as I am between 20-21, and I come from an IT background. I know quite little basics about C (upto structures) C++ (upto objects, I was not able to understand templates) Core Java (some basics and OOP concept) RDBMS Visual Basic 6 (used to do this long back) UNIX (a bunch of commands like who, finger, chmod, ls and a bit of #bash) Or is there anything else that I left out? I need you guys to please give me a feedback and the reason why I should select that field.

Read the article
The battle between Java vs. C#

The battle between Java vs. C# has been a big debate amongst the development community over the last few years. Both languages have specific pros and cons based on the needs of a particular project. In general both languages utilize a similar coding syntax that is based on C++, and offer developers similar functionality. This being said, the communities supporting each of these languages are very different. The divide amongst the communities is much like the political divide in America, where the Java community would represent the Democrats and the .Net community would represent the Republicans. The Democratic Party is a proponent of the working class and the general population. Currently, Java is deeply entrenched in the open source community that is distributed freely to anyone who has an interest in using it. Open source communities rely on developers to keep it alive by constantly contributing code to make applications better; essentially they develop code by the community. This is in stark contrast to the C# community that is typically a pay to play community meaning that you must pay for code that you want to use because it is developed as products to be marketed and sold for a profit. This ties back into my reference to the Republicans because they typically represent the needs of business and personal responsibility. This is emphasized by the belief that code is a commodity and that it can be sold for a profit which is in direct conflict to the laissez-faire beliefs of the open source community. Beyond the general differences between Java and C#, they also target two different environments. Java is developed to be environment independent and only requires that users have a Java virtual machine running in order for the java code to execute. C# on the other hand typically targets any system running a windows operating system and has the appropriate version of the .Net Framework installed. 
However, recently there has been a push by a segment of the open source community, based around the Mono project, that lets C# code run on non-Windows operating systems. In addition, another feature of C# is that it compiles into an intermediate language, and this is what is executed when the program runs. Because C# compiles down to the Common Intermediate Language (CIL), which is executed by the Common Language Runtime (CLR), it can be combined with other languages that also compile to CIL, such as Visual Basic (VB) .NET and F#. This interaction between multiple languages in the .NET Framework enables projects to utilize existing code bases regardless of the actual syntax, because they can all be compiled to CIL and executed together as one codebase. As a software engineer I personally feel that it is really important to learn as many languages as you can, or at least be open to learning as many languages as you can, because no one language will work in every situation. In some cases Java may be a better choice for a project, and in others C# may be. It really depends on the requirements of a project and the time constraints. In addition, I feel that it is really important to concentrate on understanding the logic of programming and to be able to translate business requirements into technical requirements. If you can understand both programming logic and business requirements, then deciding which language to use is basically just choosing what syntax to write for a given business problem or need. In regard to code refactoring and dynamic languages, it really does not matter. Eventually all projects will be refactored or decommissioned to allow for progress. This is the way of life in the software development industry. The language of a project should not be chosen based on the fact that a project will eventually be refactored, because they all will get refactored.

Read the article
Essbase BSO Data Fragmentation

- by Ann Donahue

Essbase BSO Data Fragmentation Data fragmentation naturally occurs in Essbase Block Storage (BSO) databases where there are a lot of end user data updates, incremental data loads, many lock and send, and/or many calculations executed. If an Essbase database starts to experience performance slow-downs, this is an indication that there may be too much fragmentation. See Chapter 54 Improving Essbase Performance in the Essbase DBA Guide for more details on measuring and eliminating fragmentation: http://docs.oracle.com/cd/E17236_01/epm.1112/esb_dbag/daprcset.html Fragmentation is likely to occur in the following situations: Read/write databases that users are constantly updating data Databases that execute calculations around the clock Databases that frequently update and recalculate dense members Data loads that are poorly designed Databases that contain a significant number of Dynamic Calc and Store members Databases that use an isolation level of uncommitted access with commit block set to zero There are two types of data block fragmentation Free space tracking, which is measured using the Average Fragmentation Quotient statistic. Block order on disk, which is measured using the Average Cluster Ratio statistic. Average Fragmentation Quotient The Average Fragmentation Quotient ratio measures free space in a given database. As you update and calculate data, empty spaces occur when a block can no longer fit in its original space and will either append at the end of the file or fit in another empty space that is large enough. These empty spaces take up space in the .PAG files. The higher the number the more empty spaces you have, therefore, the bigger the .PAG file and the longer it takes to traverse through the .PAG file to get to a particular record. An Average Fragmentation Quotient value of 3.174765 means the database is 3% fragmented with free space. Average Cluster Ratio Average Cluster Ratio describes the order the blocks actually exist in the database. 
An Average Cluster Ratio number of 1 means all the blocks are ordered in the correct sequence, in the order of the outline. As you load data and calculate data blocks, the sequence can start to get out of order. This is because when you write to a block, Essbase may not be able to place it back in the exact same spot in the database where it existed before. The lower this number, the more out of order the blocks are and the more performance is affected. An Average Cluster Ratio value of 1 means no fragmentation. Any value lower than 1, e.g. 0.01032828, means the data blocks are getting further out of order from the outline order.

Eliminating Data Block Fragmentation

Both types of data block fragmentation can be removed by doing a dense restructure or an export/clear/import of the data. There are two types of dense restructure:

1. Implicit Restructures

An implicit dense restructure happens when outline changes are made using EAS Outline Editor or Dimension Build. Essbase restructures create new .PAG files, restructuring the data blocks in them. When Essbase restructures the data blocks, it regenerates the index automatically so that index entries point to the new data blocks. Empty blocks are NOT removed with implicit restructures.

2. Explicit Restructures

An explicit dense restructure happens when a restructure of the database is initiated manually. An explicit dense restructure is a full restructure, which comprises a dense restructure as outlined above plus the removal of empty blocks.

Empty Blocks vs. Fragmentation

The existence of empty blocks is not considered fragmentation. Empty blocks can be created through calc scripts or formulas. An empty block will add to an existing database block count and will be included in the block counts of the database properties. There are no statistics for empty blocks.
The only way to determine if empty blocks exist in an Essbase database is to record your current block count, export the entire database, clear the database then import the exported data. If the block count decreased, the difference is the number of empty blocks that had existed in the database.
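
The arithmetic in this article is small enough to write down. Below is a minimal Python sketch, assuming the two statistics and the two block counts have already been read from the database properties by hand; the function names and the block counts in the usage lines are hypothetical, and nothing here calls Essbase itself.

```python
def describe_fragmentation(avg_fragmentation_quotient, avg_cluster_ratio):
    """Interpret the two statistics the way the article describes them."""
    return {
        # A quotient of 3.174765 is read as roughly 3% free-space fragmentation.
        "free_space_percent": round(avg_fragmentation_quotient),
        # A ratio of 1 means the blocks are stored in outline order.
        "blocks_in_outline_order": avg_cluster_ratio >= 1.0,
    }


def empty_block_count(blocks_before_export, blocks_after_import):
    """Empty blocks = block count before export/clear minus count after import."""
    if blocks_after_import > blocks_before_export:
        raise ValueError("block count grew; the counts were probably swapped")
    return blocks_before_export - blocks_after_import


# The statistics are the example values from the article; the counts are made up.
print(describe_fragmentation(3.174765, 0.01032828))
print(empty_block_count(120_000, 118_500))
```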

Read the article
Would this be a good web application architecture?

- by Gustav Bertram

My problem

Our MVC-based framework does not allow us to cache only part of our output. Ideally we want to cache static and semi-static bits, and run dynamic bits. In addition, we need to consider data caching that reacts to database changes.

My idea

The concept I came up with was to represent a page as a tree of XML fragment objects. (I say XML, but I mean XHTML.) Some of the fragments are dynamic, and can pull their data directly from models or other sources, but most of the fragments are static scaffolding. If a subtree of fragments is completely static, then I imagine that they could unfold into pure XML that would then be cached as the text representation of their parent element. This process would ideally continue until we are left with a root element that contains all of the static XML, and has a couple of dynamic XML fragments that are resolved and attached to the relevant nodes of the XML tree just before the page is displayed. In addition to separating content into dynamic and static fragments, some fragments could be dynamic and cached. A simple expiry time which propagates up through the XML fragment tree would indicate that a specific fragment should periodically be refreshed. A newspaper section or front page does not need to be updated each second. Minutes or sometimes even longer is sufficient. Other fragments would be dynamic and uncached. Typically too many articles are viewed for them to be cached: the cache would overflow. Some individual articles may be cached if they are extremely popular.

Functional notes

The folding mechanism could be smart enough to judge when it would be more profitable to fold a dynamic cached fragment and propagate the expiry date to the parent fragment, or to keep it separate and simply attach it to the XML tree when resolving the page.
If some dynamic cached fragments are associated to database objects through mechanisms like a globally unique content id, then changes to the database could trigger changes to the output cache. If fragments store the identifiers of parent fragments, then they could trigger a refolding process that would then include the updated data. A set of pure XML with an ordered array of fragment objects (that each store the identifying information of the node to which they should be attached), can be resolved in a fairly simple way by walking the XML tree, and merging the data from the fragments. Because it is not necessary to parse and construct the entire tree in memory before attaching nodes, processing should be fairly fast. The identifiers of each fragment would be a combination of relevant identity data and the type of fragment object. Cached parent fragments would contain references to these identifiers, in order to then either pull them from the fragment cache, or to run their code. The controller's responsibility is reduced to making changes to the database, and telling the root XML fragment object to render itself. The Question My question has two parts: Is this a good design? Are there any obvious flaws I'm missing? Has somebody else thought of this before? References? Is there an existing alternative that I should consider? A cool templating engine maybe?
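
The folding step described in the post can be sketched in a few lines. This is an illustrative Python sketch, not the poster's framework: the class names `StaticFragment`, `DynamicFragment` and `Element` are invented here, and expiry propagation and database-triggered refolding are left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StaticFragment:
    xml: str  # already-rendered markup that can be cached as text


@dataclass
class DynamicFragment:
    render: Callable[[], str]  # resolved only when the page is built


@dataclass
class Element:
    tag: str
    children: List[object] = field(default_factory=list)


def fold(node):
    """Collapse fully static subtrees into a single StaticFragment."""
    if isinstance(node, (StaticFragment, DynamicFragment)):
        return node
    folded = [fold(child) for child in node.children]
    if all(isinstance(child, StaticFragment) for child in folded):
        inner = "".join(child.xml for child in folded)
        return StaticFragment(f"<{node.tag}>{inner}</{node.tag}>")
    return Element(node.tag, folded)  # keep the dynamic parts separate


def render(node):
    """Resolve a (possibly folded) tree into the final markup."""
    if isinstance(node, StaticFragment):
        return node.xml
    if isinstance(node, DynamicFragment):
        return node.render()
    inner = "".join(render(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


page = Element("body", [
    Element("div", [StaticFragment("<h1>News</h1>"), StaticFragment("<p>Welcome</p>")]),
    DynamicFragment(lambda: "<p>42 readers online</p>"),
])
folded = fold(page)
print(render(folded))
```

After folding, the static `div` is a single cached string and only the dynamic fragment runs code at render time, which is the split the post is aiming for.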

Read the article
ARTS Reference Model for Retail

- by Sanjeev Sharma

Consider a hypothetical scenario where you have been tasked with setting up retail operations for an electronic goods, daily consumables or luxury brand. It is very likely you will be faced with the following questions: What are the essential business capabilities that you must have in place? What are the essential business activities underpinning each of the business capabilities identified in Step 1? What is the set of steps that you need to perform to execute each of the business activities identified in Step 2? Answers to the above will drive your investments in software and hardware to enable the core retail operations. More importantly, the choices you make in responding to the above questions will have several implications in the short run and in the long run. In the short term, you will incur the time and cost of defining your technology requirements, procuring the software/hardware components and getting them up and running. In the long term, as you grow your operations organically or through M&A, partnerships and franchise business models, you will invariably need to make more technology investments to manage the greater complexity (scale and scope) of business operations. "As new software applications, such as time & attendance, labor scheduling, and POS transactions, just to mention a few, are introduced into the store environment, it takes a disproportionate amount of time and effort to integrate them with existing store applications. These integration projects can add up to 50 percent to the time needed to implement a new software application and contribute significantly to the cost of the overall project, particularly if a systems integrator is called in. This has been the reality that all retailers have had to live with over the last two decades. The effect of the environment has not only been to increase costs, but also to limit retailers' ability to implement change and the speed with which they can do so." 
(excerpt taken from here) Now, one would think a lot of retailers would have already gone through the pain of finding answers to these questions, so why re-invent the wheel? Precisely so, a major effort began almost 17 years ago in the retail industry to make it less expensive and less difficult to deploy new technology in stores and at the retail enterprise level. This effort is called the Association for Retail Technology Standards (ARTS). Without standards such as those defined by ARTS, you would very likely end up experiencing the following: Increased Time and Cost due to resource wastage arising from re-inventing the wheel i.e. re-creating vanilla processes from scratch, and incurring, otherwise avoidable, mistakes and errors by ignoring experience of others Sub-optimal Process Efficiency due to narrow, isolated view of processes thereby ignoring process inter-dependencies i.e. optimizing parts but not the whole, and resulting in lack of transparency and inter-departmental finger-pointing Embracing ARTS standards as a blue-print for establishing or managing or streamlining your retail operations can benefit you in the following ways: Improved Time-to-Market from parity with industry best-practice processes e.g. ARTS, thus avoiding “reinventing the wheel” for common retail processes and focusing more on customizing processes for differentiations, and lowering integration complexity and risk with a standardized vocabulary for exchange between internal and external i.e. 
partner systems Lower Operating Costs by embracing the ARTS enterprise-wide process reference model for developing and streamlining retail operations holistically instead of a narrow, silo-ed view, and procuring IT systems in compliance with ARTS thus avoiding IT budget marginalization While parity with industry standards such as ARTS business process model by itself does not create a differentiation, it does however provide a higher starting point for bridging the strategy-execution gap in setting up and improving retail operations.

Read the article
Documentation and Test Assertions in Databases

- by Phil Factor

When I first worked with Sybase/SQL Server, we thought our databases were impressively large but they were, by today’s standards, pathetically small. We had one script to build the whole database. Every script I ever read was richly annotated; it was more like reading a document. Every table had a comment block, and every line would be commented too. At the end of each routine (e.g. procedure) was a quick integration test, or series of test assertions, to check that nothing in the build was broken. We simply ran the build script, stored in the Version Control System, and it pulled everything together in a logical sequence that not only created the database objects but pulled in the static data. This worked fine at the scale we had. The advantage was that one could, by reading the source code, reach a rapid understanding of how the database worked and how one could interface with it. The problem was that it was a system that meant that only one developer at the time could work on the database. It was very easy for a developer to execute accidentally the entire build script rather than the selected section on which he or she was working, thereby cleansing the database of everyone else’s work-in-progress and data. It soon became the fashion to work at the object level, so that programmers could check out individual views, tables, functions, constraints and rules and work on them independently. It was then that I noticed the trend to generate the source for the VCS retrospectively from the development server. Tables were worst affected. You can, of course, add or delete a table’s columns and constraints retrospectively, which means that the existing source no longer represents the current object. If, after your development work, you generate the source from the live table, then you get no block or line comments, and the source script is sprinkled with silly square-brackets and other confetti, thereby rendering it visually indigestible. Routines, too, were affected. 
In our system, every routine had a directly attached string of unit tests. A retro-generated routine has no unit tests or test assertions. Yes, one can still commit the test code to the VCS, but it’s a separate module, and teams end up running the whole suite of tests for every individual change rather than just the tests for that routine, which doesn’t scale for database testing. With extended properties, one can get the best of both worlds, and even use them to put blame, praise or annotations into your VCS. It requires a lot of work, though, particularly the script to generate the table. The problem is that there are no conventional names beyond ‘MS_Description’ for the special use of extended properties. This makes it difficult to do splendid things such as ensuring the integrity of the build by running a suite of tests that are actually stored in extended properties within the database and therefore the VCS. We have lost the readability of database source code over the years, and largely jettisoned the use of test assertions as part of the database build. This is not unexpected in view of the increasing complexity of the structure of databases and the number of programmers working on them. There must, surely, be a way of getting them back, but I sometimes wonder if I’m one of very few who miss them.

Read the article
Ubuntu 12.04 LXC nat prerouting not working

- by petermolnar

I have a running Debian Wheezy setup that I copied exactly to an Ubuntu 12.04 box (elementary OS, used as desktop as well). While the Debian setup runs flawlessly, the Ubuntu version dies on the prerouting to containers (or so it seems).

In short:

lxc works
containers work and run
connecting to container from host OK (including mixed ports & services)
connecting to outside world from container is fine

What does not work is connecting from another box to the host on a port that should be NATed to a container.

The setups:

/etc/rc.local

CMD_BRCTL=/sbin/brctl
CMD_IFCONFIG=/sbin/ifconfig
CMD_IPTABLES=/sbin/iptables
CMD_ROUTE=/sbin/route
NETWORK_BRIDGE_DEVICE_NAT=lxc-bridge
HOST_NETDEVICE=eth0
PRIVATE_GW_NAT=192.168.42.1
PRIVATE_NETMASK=255.255.255.0
PUBLIC_IP=192.168.13.100
${CMD_BRCTL} addbr ${NETWORK_BRIDGE_DEVICE_NAT}
${CMD_BRCTL} setfd ${NETWORK_BRIDGE_DEVICE_NAT} 0
${CMD_IFCONFIG} ${NETWORK_BRIDGE_DEVICE_NAT} ${PRIVATE_GW_NAT} netmask ${PRIVATE_NETMASK} promisc up

Therefore lxc network is 192.168.42.0/24 and the host eth0 ip is 192.168.13.100; setup via network manager as static address.

iptables:

*mangle
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
COMMIT
*filter
:FORWARD ACCEPT [0:0]
:INPUT DROP [0:0]
:OUTPUT ACCEPT [0:0]
# Accept traffic from internal interfaces
-A INPUT -i lo -j ACCEPT
# accept traffic from lxc network
-A INPUT -d 192.168.42.1 -s 192.168.42.0/24 -j ACCEPT
# Accept internal traffic
# Make sure NEW incoming tcp connections are SYN
# packets; otherwise we need to drop them:
-A INPUT -p tcp ! --syn -m state --state NEW -j DROP
# Packets with incoming fragments drop them. This attack result into Linux server panic such data loss.
-A INPUT -f -j DROP
# Incoming malformed XMAS packets drop them:
-A INPUT -p tcp --tcp-flags ALL ALL -j DROP
# Incoming malformed NULL packets:
-A INPUT -p tcp --tcp-flags ALL NONE -j DROP
# Accept traffic with the ACK flag set
-A INPUT -p tcp -m tcp --tcp-flags ACK ACK -j ACCEPT
# Allow incoming data that is part of a connection we established
-A INPUT -m state --state ESTABLISHED -j ACCEPT
# Allow data that is related to existing connections
-A INPUT -m state --state RELATED -j ACCEPT
# Accept responses to DNS queries
-A INPUT -p udp -m udp --dport 1024:65535 --sport 53 -j ACCEPT
# Accept responses to our pings
-A INPUT -p icmp -m icmp --icmp-type echo-reply -j ACCEPT
# Accept notifications of unreachable hosts
-A INPUT -p icmp -m icmp --icmp-type destination-unreachable -j ACCEPT
# Accept notifications to reduce sending speed
-A INPUT -p icmp -m icmp --icmp-type source-quench -j ACCEPT
# Accept notifications of lost packets
-A INPUT -p icmp -m icmp --icmp-type time-exceeded -j ACCEPT
# Accept notifications of protocol problems
-A INPUT -p icmp -m icmp --icmp-type parameter-problem -j ACCEPT
# Respond to pings, but limit
-A INPUT -m icmp -p icmp --icmp-type echo-request -m state --state NEW -m limit --limit 6/s -j ACCEPT
# Allow connections to SSH server
-A INPUT -p tcp -m tcp --dport 22 -m state --state NEW -m limit --limit 12/s -j ACCEPT
COMMIT
*nat
:OUTPUT ACCEPT [0:0]
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
-A PREROUTING -d 192.168.13.100 -p tcp -m tcp --dport 2221 -m state --state NEW -m limit --limit 12/s -j DNAT --to-destination 192.168.42.11:22
-A PREROUTING -d 192.168.13.100 -p tcp -m tcp --dport 80 -m state --state NEW -m limit --limit 512/s -j DNAT --to-destination 192.168.42.11:80
-A PREROUTING -d 192.168.13.100 -p tcp -m tcp --dport 443 -m state --state NEW -m limit --limit 512/s -j DNAT --to-destination 192.168.42.11:443
-A POSTROUTING -d 192.168.42.0/24 -o eth0 -j SNAT --to-source 192.168.13.100
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT

sysctl:

net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.mc_forwarding = 0
net.ipv4.conf.default.forwarding = 1
net.ipv4.conf.default.mc_forwarding = 0
net.ipv4.ip_forward = 1

I've set up full iptables log on the container; none of the packets addressed to 192.168.13.100, port 80 is reaching the container. I've even tried different kernels (server kernel, raring lts kernel, etc), modprobe everything iptables & nat related, nothing. Any ideas?

Read the article
PowerShell and SMO – be careful how you iterate

- by Fatherjack

I’ve yet to have a totally smooth experience with PowerShell and it was late on Friday when I crashed into this problem. I haven’t investigated if this is a generally well understood circumstance and if it is then I apologise for repeating everything. Scenario: I wanted to scan a number of server for many properties, including existing logins and to identify which accounts are bestowed with sysadmin privileges. A great task to pass to PowerShell, so with a heavy heart I started up PowerShellISE and started typing. The script doesn’t come easily to me but I follow the logic of SMO and the properties and methods available with the language so it seemed something I should be able to master. Version #1 of my script. And the results it returns when executed against my home laptop server. These results looked good and for a long time I was concerned with other parts of the script, for all intents and purposes quite happy that this was an accurate assessment of the server. Let’s just review my logic for each step of the code at the top. Lines 1 to 7 just set up our variables and write out the header message Line 8 our first loop, to go through each login on the server Line 10 an inner loop that will assess each role name that each login has been assigned Line 11 a test to see if each role has the name ‘sysadmin’ Line 13 write out the login name with a bright format as it is a sysadmin login Line 17 write out the login name with no formatting It is quite possible that here someone with more PowerShell experience than me will be shouting at their screen pointing at the error I made but to me this made total sense. Until I altered the code, I altered lines 6 and 7 of code above to be: $c = $Svr.Logins.Count write-host “There are $c Logins on the server” This changed my output to look like this: This started alarm bells ringing – there are clearly not 13 logins listed So, let’s see where things are going wrong, edit the script so it looks like this. 
I’ve highlighted the changes to make. Running this code shows me these results. Our $n variable should count up by one for each login returned, and we are clearly missing some logins. I referenced this list back to Management Studio for my server and see the logins as below, where there are clearly 13 logins. We see a login called Annette in SSMS but not in the script results, so I opened that up and looked at its properties, and its server roles in particular. The account has only public access to the server. Inspection of the other logins that the PowerShell script misses out shows they too are only members of the public role. Right now I can’t work out whether there is a good reason for this and whether it should be expected behaviour or not. Please spend a few minutes to leave a comment if you have an opinion or theory on this. How to get the full list of logins. Clearly I needed to get a full list of the logins, so I set about reviewing my code to see if there was a better way to iterate through the roles for each login. This is the code that I came up with, and I think it is doing everything that I need it to. It gives me the expected results like this. So it seems that the ListMembers() method is the troublemaker in my first versions of the code. I would have expected that ListMembers should return logins that are only members of the public role; certainly TechNet makes no reference to it being left out in its Login.ListMembers details. Suffice to say, it’s a lesson learned, and I will approach using it with caution in future circumstances.
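
One possible reading of the behaviour described above, sketched in Python rather than PowerShell and using made-up data: if the per-login role list comes back empty for logins that hold only the public role (an assumption here, consistent with what the post observes), then any output written inside an inner loop over roles never runs for those logins. Testing membership once per login avoids that. The `roles_by_login` dictionary and the login names other than Annette are hypothetical stand-ins for the SMO objects, not the author's script.

```python
# Hypothetical stand-in for "the roles returned for each login".
# Assumption: a login with only the public role yields an empty list.
roles_by_login = {
    "sa": ["sysadmin"],
    "Annette": [],
    "ReportUser": ["bulkadmin"],
}


def report_inside_role_loop(roles_by_login):
    """Mimics writing output inside the inner loop over role names."""
    seen = []
    for login, roles in roles_by_login.items():
        for role in roles:
            # This line never runs when `roles` is empty.
            seen.append((login, role == "sysadmin"))
    return seen


def report_per_login(roles_by_login):
    """Writes exactly one entry per login, whatever roles it has."""
    return [(login, "sysadmin" in roles) for login, roles in roles_by_login.items()]


print(report_inside_role_loop(roles_by_login))
print(report_per_login(roles_by_login))
```

Under that assumption the first function silently drops Annette, while the second reports all three logins and still flags the sysadmin account.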

Introduction to WebCenter Personalization: “The Conductor”

- by Steve Pepper

There are some new faces in the town of WebCenter with the latest 11g PS3 release. A new component has introduced itself as "Oracle WebCenter Personalization", a.k.a. WCP, to simplify delivery of a personalized experience and content to end users. This posting reviews one of the primary components within WCP: "The Conductor".

The Conductor: This ain't just an ordinary cloud...

One of the founding principles behind WebCenter Personalization was to provide an open client-side API that remains independent of the technology invoking it, in addition to independence from the architecture running it. The Conductor delivers this, and much, much more. The Conductor is the engine behind WebCenter Personalization that allows flow-based documents, called "Scenarios", to be managed and executed on the server side through a well-published, RESTful API. The Conductor also supports an extensible model for custom provider integration that can be easily invoked within a Scenario to promote seamless integration with existing business assets.

Introducing the Scenario

Conductor Scenarios are declarative, offline-authored documents created with the custom Personalization JDeveloper bundle included with WebCenter. A Scenario contains one (or more) statements that can:

Create variables that are scoped to the current execution context
Iterate over collections, or loop until a specific condition is met
Execute one or more statements when a condition is met
Invoke other scenarios that exist within the same namespace
Invoke a data provider that integrates with custom applications

Once a variable is assigned within the Scenario's execution context, it can be referenced anywhere within the same Scenario using the common Expression Language syntax used in J2EE web containers. Scenarios are then published and tested to the Integrated WebLogic Server domain, or published remotely to other domains running WebCenter Personalization.

Various Client-side Models

The Conductor server API is built upon RESTful services that support a wide variety of clients able to communicate over HTTP. The Conductor supports the following client-side models:

REST: Popular browser-based languages can be used to manage and execute Conductor Scenarios. There are other public methods to retrieve configured provider metadata that can be used by custom applications. The Conductor currently supports XML and JSON for its API syntax.
Java: WebCenter Personalization delivers a robust and lightweight Java client with the popular Jersey framework as its foundation. It has never been easier to write a remote Java client to manage remote RESTful services.
Expression Language (EL): Allow the results of Scenario execution to control your user interface or embed personalized content using the session-scoped managed bean. The EL client can also be used in straight JSP pages with minimal configuration.

Extensible Provider Framework

The Conductor supports a pluggable provider framework for integrating custom code with Scenario execution. There are two types of providers supported by the Conductor:

Function Provider: Function Providers are simple annotated Java classes with static methods that are meant to serve as utilities. Some common uses would include object creation or instantiation, data transformation, and the like. Function Providers can be invoked using the common EL syntax from variable assignments, conditions, and loops. For example: ${myUtilityClass:doStuff(arg1,arg2)} If you are familiar with EL Functions, Function Providers are based on the same concept.
Data Provider: Like Function Providers, Data Providers are annotated Java classes, but they must adhere to a much stricter object model. Data Providers have access to a wealth of Conductor services, such as a namespace-scoped configuration API that can be managed by Oracle Enterprise Manager, the Scenario execution context for expression resolution, and more. Oracle ships three out-of-the-box data providers that support integration with standardized content servers (CMIS), federated profile properties through the Properties Service, and WebCenter Activity Graph.

Useful References

If you are looking to get started immediately writing your own application using WebCenter Personalization Services, you will find the following references helpful in getting you on your way:

Personalizing WebCenter Applications
Authoring Personalized Scenarios in JDeveloper
Using Personalization APIs Externally
Implementing and Calling Function Providers
Implementing and Calling Data Providers

SQL to select random mix of rows fairly [migrated]

- by Matt Sieker

Here's my problem: I have a set of tables in a database, populated with data from a client, that contain product information. In addition to the basic product information, there is also information about the manufacturer, the categories for those products (a product can be in one or more categories; these are referred to as "Product Categories"), and which stores the products are available at. These tables are updated once a week from a feed from the customer. Since some of the product categories are the same, or closely related, for our purposes, there is another level of categories called "General Categories"; a general category can have one or more product categories.

For the scope of these tables, here are some rough numbers:

Data Tables:
Products: 475,000
Manufacturers: 1300
Stores: 150
General Categories: 245
Product Categories: 500

Mapping Tables:
Product Category -> Product: 655,000
Stores -> Products: 50,000,000

Now, for the actual problem: As part of our software, we need to select n random products, given a store and a general category. However, we also need to ensure a good mix of manufacturers, as in some categories a single manufacturer dominates the results, and selecting rows at random causes the results to strongly favor that manufacturer.

The solution that is currently in place, which works for most cases, involves selecting all of the rows that match the store and category criteria, partitioning them on manufacturer, and including their row number from within their partition. We then select from that where the row number for that manufacturer is at most n, and use ROWCOUNT to clamp the total rows returned to n.
This query looks something like this:

    SET ROWCOUNT 6

    select p.Id, GeneralCategory_Id, Product_Id,
           ISNULL(m.DisplayName, m.Name) AS Vendor,
           MSRP, MemberPrice, FamilyImageName
    from (select p.Id, gc.Id GeneralCategory_Id, p.Id Product_Id,
                 ctp.Store_id, Manufacturer_id,
                 ROW_NUMBER() OVER (PARTITION BY Manufacturer_id ORDER BY NEWID()) AS 'VendorOrder',
                 MSRP, MemberPrice, FamilyImageName
          from GeneralCategory gc
          inner join GeneralCategoriesToProductCategories gctpc ON gc.Id = gctpc.GeneralCategory_Id
          inner join ProductCategoryToProduct pctp on gctpc.ProductCategory_Id = pctp.ProductCategory_Id
          inner join Product p on p.Id = pctp.Product_Id
          inner join StoreToProduct ctp on p.Id = ctp.Product_id
          where gc.Id = @GeneralCategory
            and ctp.Store_id = @StoreId
            and p.Active = 1
            and p.MemberPrice > 0) p
    inner join Manufacturer m on m.Id = p.Manufacturer_id
    where VendorOrder <= 6
    order by NEWID()

    SET ROWCOUNT 0

(I've tried to somewhat format it to make it cleaner, but I don't think it really helps)

Running this query with an execution plan shows that for the majority of these tables, it's doing a Clustered Index Seek. There are two operations that take up roughly 90% of the time:

Index Seek (Nonclustered) on StoreToProduct: 17%. This table just contains the key of the store and the key of the product. It seems that NHibernate decided not to make a composite key when making this table, but I'm not concerned about this at this point, as compared to the other seek...
Clustered Index Seek on Product: 69%. I really have no clue how I could make this one more performant.

On categories without a lot of products, performance is acceptable (<50ms); however, larger categories can take a few hundred ms, with the largest category (which has about 170k products) taking 3s.

It seems I have two ways to go from this point:

Somehow optimize the existing query and table indices to lower the query time. As almost every expensive operation is already a clustered index seek, I don't know what could be done there. The inner query could be tuned to not return all of the possible rows for that category, but I am unsure how to do this and maintain the requirements (random products, with a good mix of manufacturers).
Denormalize this data for the purpose of this query when doing the once-a-week import. However, I am unsure how to do this and maintain the requirements.

Does anyone have any input on either of these items?
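To make the selection logic easier to reason about away from the database, here is a minimal Python sketch of the same idea as the query above: shuffle within each manufacturer, keep at most n rows per manufacturer, then shuffle the shortlist and take n. The function name `pick_mixed`, the `(product_id, manufacturer_id)` tuple shape, and the sample data are assumptions for illustration; this is not a replacement for the T-SQL and says nothing about its performance.

```python
import random
from collections import defaultdict


def pick_mixed(products, n, rng=None):
    """Pick up to n random products, capping each manufacturer at n rows.

    products: iterable of (product_id, manufacturer_id) tuples that already
    match the store and category filter (hypothetical shape).
    """
    rng = rng or random.Random()

    # Partition the candidate rows by manufacturer.
    by_maker = defaultdict(list)
    for product_id, maker_id in products:
        by_maker[maker_id].append(product_id)

    # Per manufacturer, keep a random sample of at most n rows. This plays
    # the role of ROW_NUMBER() OVER (PARTITION BY ... ORDER BY NEWID()) <= n.
    shortlist = []
    for maker_id, ids in by_maker.items():
        for product_id in rng.sample(ids, min(n, len(ids))):
            shortlist.append((product_id, maker_id))

    # Final random order, clamped to n rows (ORDER BY NEWID() + ROWCOUNT).
    rng.shuffle(shortlist)
    return shortlist[:n]


if __name__ == "__main__":
    # One dominant manufacturer "A" and a few small ones.
    candidates = [(i, "A") for i in range(1000)]
    candidates += [(1000 + i, maker) for i, maker in enumerate("BCDE")]
    print(pick_mixed(candidates, 6, random.Random(1)))
```

The cap shrinks a dominant manufacturer from 1000 candidate rows to n before the final draw, which is where the better mix comes from. Note that with a cap equal to n, as in `VendorOrder <= 6`, one manufacturer can still fill the whole result; a smaller per-manufacturer cap would trade some randomness for a stronger mix.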

New Version 3.1 Endeca Information Discovery Now Available

- by Mike.Hallett(at)Oracle-BI&EPM

Business User Self-Service Data Mash-up Analysis and Discovery integrated with OBI11g and Hadoop

Oracle Endeca Information Discovery 3.1 (OEID) is a major release that incorporates significant new self-service discovery capabilities for business users, including agile data mashup, extended support for unstructured analytics, and an even tighter integration with Oracle BI.

· Self-Service Data Mashup and Discovery Dashboards: business users can combine information from multiple sources, including their own uploaded spreadsheets, to conduct analysis on the complete set. Creating discovery dashboards has been made even easier by intuitive drag-and-drop layouts and wizard-based configuration. Business users can now build new discovery applications in minutes, without depending on IT.

· Enhanced Integration with Oracle BI: OEID 3.1 enhances its native integration with Oracle Business Intelligence Foundation. Business users can now incorporate information from trusted BI warehouses, leveraging dimensions and attributes defined in Oracle’s Common Enterprise Information Model, but evolve them based on the varying day-to-day demands and requirements that they personally manage.

· Deep Unstructured Analysis: business users can gain new insights from a wide variety of enterprise and public sources, helping companies to build an actionable Big Data strategy. With OEID’s long-standing differentiation in correlating unstructured information with structured data, business users can now perform their own text mining to identify hidden concepts, without having to request support from IT. They can augment these insights with best-in-class keyword search and pattern matching, all in the context of rich, interactive visualizations and analytic summaries.

· Enterprise-Class Self-Service Discovery: OEID 3.1 enables IT to provide a powerful self-service platform to the business as part of a broader Business Analytics strategy, preserving the value of existing investments in data quality, governance, and security. Business users can take advantage of IT-curated information to drive discovery across high volumes and varieties of data, and share insights with colleagues at a moment’s notice.

· Harvest Content from the Web with the Endeca Web Acquisition Toolkit: Oracle now provides best-of-breed data access to website content through the Oracle Endeca Web Acquisition Toolkit. This provides an agile, graphical interface for developers to rapidly access and integrate any information exposed through a web front-end. Organizations can now cost-effectively include content from consumer sites, industry forums, government or supplier portals, cloud applications, and myriad other web sources as part of their overall strategy for data discovery and unstructured analytics.

For more information:

OEID 3.1 OTN Software and Documentation Download
Endeca available for download on Software Delivery Cloud (eDelivery)
New OEID 3.1 Videos on YouTube
Oracle.com Endeca Site

Questions to ask to ensure someone understands programming? (and iOS)

- by Stephen J

So, I've been tutoring my friend for 2 years. Most people learn programming on their own in 3-6 months (sans algorithms). It's confusing 'cause he'll run anywhere I tell him to, understands how to read C and C++ honestly better than the average college student, and he'll modify and repeat anything I do... but for the love of god he doesn't move on to new things, and he still has test anxiety. I've recently realized he's copied and toyed with existing code, but not once gained an understanding of why.

I was under the impression he was learning fast because he could write it, but when you say "Make a function that takes an NSString" and he says "How?" and I say "The same way you make ANY function that takes any parameter, NSString is just a type like int" and all I hear is "No, it's an NSString, it's a special thing." and we get into an arguing match 'cause I'm like "It's just a class like any other class, you've used them for months now" and blah... I've subconsciously avoided comprehension questions because of this.

Anyway, if you have him copy a program and say "Just initialize it" "Where?" "I don't care, didLoad or initWithCoder or awakeFromNib, anywhere it gets initialized" and "No, it has to be exactly where you had it!" "No it doesn't!" I'm sick of this, but he won't give up. So I'm done avoiding these yelling matches, and I'm becoming a sadist from now on.

I would like some help in finding questions to ask him that force him to understand what he's doing, and any resources I can find. CQuestions looked like a good site, but now I need some iPhone stuff. For example:

*What do properties do? How are they changed? How do you change the name of the getter?
*Why are Booleans inefficient? What advantage does int have over a boolean, and how does the bit-shift operator help?
*What does Copy do to a string?
*What's the difference between a view controller and a UIView?
*Write a program from memory that displays blah on screen, and flashes each view one by one.

From beginner up to intermediate, hobbyist with some algebra at most. I'm just looking for resources to work with. I left in the backstory so you know to "twist" the questions so he doesn't know he's supposed to init a variable here or there, but has to figure it out, and learn why it goes "here" or that "anywhere is fine as long as it's". Sample programs, anything.

I'm relatively open about this because, being a programmer, I seriously doubt he's the only one who has this issue. I'd like to know how others have overcome similar issues. What made things "click" for you? Did you have a hard time finding answers on Google, and how did you learn a better way to find what you were looking for? (He's so exact, he'll search for how to write a checkers program with color X and Y inside a UIView as his search string, instead of breaking it up into components. I need help with that too, and believe it is related.)

This type of problem has to remind one of us of someone they know. So: exercises to force them to think? Ways we overcame this thing in the past? I greatly appreciate any help.

Experimenting with other search engines

- by Bill Graziano

I’ve been a Google user so long I can hardly remember what I used before it. Alta Vista maybe? Or Yahoo. I’ve tried Bing off and on but it never really stuck. I probably care more about search engines than your average user because of their impact on SQLTeam.com. Lately I’ve been trying two other search engines and actually switched to one of them.

I’ve played with Blekko a little in the past. They have some interesting ways to “slice up” your results. For example, searching on “SQL Server /blogs /date” should just search all the recently updated blogs. Those two extra words on the search are slashtags. The full list of slashtags runs from /forums to just see forums to /twitter to /nikon to /reviews and on and on and on. I laughed when I saw they had slashtags for both liberal and conservative. I’d hate to find any search results that don’t match my existing worldview :)

You can also create your own slashtags. I created a mini-search engine for the SQL Server blogs that I read. You can search it for “backup” at http://blekko.com/ws/backup+/billgraziano/sql-sites. I uploaded my OPML and it limited the search to just those sites.

It seems like the site is focusing more on curating results and less on algorithms. This is an interesting site for power searchers. There are some great ways to curate results using slashtags. For 99% of my searches (type words, click on one of the first few links) slashtags are overkill. They do have some good information on page and site ranking though, so I’ll probably spend some time looking through that.

Blekko recently got my attention again when they said they were banning “content farms” - and that includes eHow and experts-exchange. I always feel used when I click on a link to EE and find myself scrolling all the way to the bottom to see if I can find the answer. Sometimes it’s there but sometimes it tells me I need to pay first. I’ve longed for a way to always exclude certain sites. Blekko might be taking a hammer to a problem that needs a scalpel but it’s an interesting choice. (And some of the comments in the TechCrunch link are interesting if you’re a search nerd.)

DuckDuckGo is an odd name for a search engine. Their big hook is that they don’t keep search history. If you wade through your Google account you can probably find the page where it stores your search history. It was pretty enlightening to find mine. It was easy to disable but that got me started looking at other search engines.

DDG (or DukGo) just feels like Google used to in the old days. The results are good enough and the site is fast. Searches will return a snippet from Wikipedia or another site (like StackOverflow) at the top. I think the idea is to answer the question without needing to visit the site. I’m not sure that’s a good thing for SQLTeam.com.

The only thing I really miss is image search. You can add a “!i” at the end of any search and it will search the images on Bing. Bing doesn’t have a great image search but it works for most of what I need. They call these exclamation marks “!bangs” and they are kinda, sorta like slashtags.

I’ve been using DuckDuckGo now for a few weeks and I’m pretty happy with it. I use Chrome for my browser and it was an easy switch to make. It’s still a little surprising seeing my search results come up in a different format. I’m starting to get used to it though.

Making it GREAT! Oracle Partners Building Apps Workshop with UX and ADF in UK

- by ultan o'broin

Yes, making is what it's all about. This time, Oracle Partners in the UK were making great-looking, usable apps with the Oracle Applications Development Framework (ADF) and user experience (UX) toolkit. And what an energy-packed and productive event it was at the Oracle UK location in Thames Valley Park.

Partners learned the fundamentals of enterprise applications UX, why it's important, all about visual design, how to wireframe designs, and then how to build their already-proven designs in ADF. There was a whole day on mobile apps, learning about mobile design principles and free mobile UX and ADF resources from Oracle, and then trying it out. The workshop wrapped up with the latest Release 7 simplified UIs, Mobilytics, and other innovations from Oracle, and a live demo of a very neat ADF Mobile Android app built by an Oracle contractor.

And what a fun two days both Grant Ronald of ADF and I had in running the workshop with such a great audience, too! I particularly enjoyed the interaction in the wireframing and visual design sessions, and seeing some outstanding work done by partners. Of note from the UK workshop were innovative design features not seen before, which made me all the happier that developers were bringing their own ideas from the consumer IT world of mobility, simplicity, and social to the world of work apps in a smart way, within an enterprise methodology too.

Partner wireframe exercise. Applying mobile design principles and UX design patterns means you're already productively making great usable apps! Next, over to Oracle ADF Mobile with it!

One simple example from the design of a mobile field service app was that participants immediately saw how the UX and device functionality of the super UK-based Hailo app could influence their designs (the London cabbie influence maybe?), as well as how the maps, cameras, barcode scanners and microphones we all use on our phones could be used in work. And, of course, ADF Mobile has the device integration solutions there too! I wonder will U.S. workshops in Silicon Valley see an Uber UX influence (LOL)!

That we also had partners experienced with Oracle Forms who could now offer a roadmap from Forms to Simplified UI and Mobile using ADF, and do it through the cloud, really made this particular workshop go "ZING!" for me.

Many thanks to the Oracle PartnerNetwork (OPN) team for organizing this event with us, and to the representatives of the Oracle Partners that showed up and participated so well. That's what I love about this outreach. It's a two-way, solid value-add for all.

Interested?

Why would partners and developers with ADF skills sign up for this workshop? Here's why: Learn to use the Oracle Applications User Experience design patterns as the usability building blocks for applications development in Oracle Application Development Framework. The workshop enables attendees to build modern and visually compelling desktop and mobile applications that look and behave like Oracle Cloud Applications, and that can co-exist with partner integrations and new or existing applications deployments. Partners learn to offer customers and clients more than just coded functionality; instead they can provide a complete user experience with a roadmap for continued ROI from applications that also create more business and attract kudos and respect from other makers of apps as they're wowed by the results.

So, if you're a partner and interested in attending one of these workshops and benefitting from such learning, as well as having a platform to show off some of your own work, stay well tuned to your OPN channels, to this blog, to the VoX blog, and to the @usableapps Twitter account too.

Can't wait?

For developers and partners, some key mobile resources to explore now:

Oracle ADF Mobile UX Patterns and Components Wiki
Oracle ADF Academy (Mobile)
Oracle ADF Insider Essentials
Oracle Applications Mobile User Experience Design Patterns and Guidance

Investigating on xVelocity (VertiPaq) column size

- by Marco Russo (SQLBI)

In January I published an article about how to optimize high cardinality columns in VertiPaq. In the meantime, VertiPaq has been rebranded to xVelocity: the official name is now “xVelocity in-memory analytics engine (VertiPaq)”, but xVelocity and VertiPaq mean the same thing when we talk about Analysis Services. In this post I’ll show how to investigate the column sizes of an existing Tabular database so that you can find the most important columns to optimize.

A first approach is to look in the DataDir of Analysis Services for the folder containing the database. Then look for the biggest files in all subfolders, and you will find a file whose name contains the name of the most expensive column. However, this heuristic process is not very efficient. A better approach is to use a DMV that provides the exact information. For example, by using the following query (open SSMS, open an MDX query on the database you are interested in, and execute it) you will see all database objects sorted by used size in descending order.

SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS ORDER BY used_size DESC

You can look at the first rows to understand which are the most expensive columns in your tabular model. The interesting data provided are:

TABLE_ID: the name of the object (it can also be a dictionary or an index)
COLUMN_ID: the name of the column the object belongs to (you can also see ID_TO_POS and POS_TO_ID in case they refer to internal indexes)
RECORDS_COUNT: the number of rows in the column
USED_SIZE: the memory used by the object

By looking at the ratio between USED_SIZE and RECORDS_COUNT you can understand what you can do to optimize your tabular model. Your options are:

Remove the column. Yes, if it contains data you will never use in a query, simply remove the column from the tabular model.
Change granularity. If you are tracking time and you included milliseconds but seconds would be enough, round the data source column to the nearest second. If you have a floating point number but two decimals are good enough (e.g. the temperature), round the number to the nearest decimal that is relevant to you.
Split the column. Create two or more columns that have to be combined together in order to produce the original value. This technique is described in the VertiPaq optimization article.
Sort the table by that column. When you read the data source, you might consider sorting data by this column, so that the compression will be more efficient. However, this technique works better on columns that don’t have too many distinct values, and you will probably move the problem to another column. Sorting data starting from the lower density columns (those with a small number of distinct values) and going to higher density columns (those with high cardinality) is the technique that provides the best compression ratio.

After the optimization you should be able to reduce the used size and improve the count/size ratio you measured before. If you are interested in a longer discussion about internal storage in VertiPaq and you want to understand why this approach can save you space (and time), you can attend my 24 Hours of PASS session “VertiPaq Under the Hood” on March 21 at 08:00 GMT.
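The ratio check described above can be sketched in a few lines of Python once the DMV result has been exported. This is only an illustration: it assumes each result row is available as a dictionary keyed by the column names listed earlier, and the function name `rank_by_bytes_per_row` and all sample values are made up rather than taken from a real model.

```python
def rank_by_bytes_per_row(rows):
    """Return (bytes_per_row, table_id, column_id) tuples, largest first.

    rows: dictionaries with the TABLE_ID, COLUMN_ID, RECORDS_COUNT and
    USED_SIZE fields of the DMV (assumed to be exported beforehand).
    """
    ranked = []
    for row in rows:
        records = row["RECORDS_COUNT"]
        if records == 0:
            continue  # the ratio is undefined for empty objects
        ratio = row["USED_SIZE"] / records
        ranked.append((ratio, row["TABLE_ID"], row["COLUMN_ID"]))
    ranked.sort(reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical rows, invented for the example.
    sample = [
        {"TABLE_ID": "Sales", "COLUMN_ID": "OrderTimestamp",
         "RECORDS_COUNT": 1_000_000, "USED_SIZE": 8_000_000},
        {"TABLE_ID": "Sales", "COLUMN_ID": "Quantity",
         "RECORDS_COUNT": 1_000_000, "USED_SIZE": 500_000},
        {"TABLE_ID": "Sales", "COLUMN_ID": "Unused",
         "RECORDS_COUNT": 0, "USED_SIZE": 16},
    ]
    for ratio, table_id, column_id in rank_by_bytes_per_row(sample):
        print(f"{table_id}.{column_id}: {ratio:.2f} bytes per row")
```

Columns at the top of the ranking are the natural first candidates for the options listed above, and re-running the same ranking after a change gives a simple before/after comparison.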

