Search Results

Search found 58436 results on 2338 pages for 'data'.

Page 466/2338 | < Previous Page | 462 463 464 465 466 467 468 469 470 471 472 473  | Next Page >

  • Analytics in an Omni-Channel World

    - by David Dorf
    Retail has been around ever since mankind started bartering.  The earliest transactions were very specific to the individuals buying and selling, then someone had the bright idea to open a store.  Those transactions were a little more generic, but the store owner still knew his customers and what they wanted.  As the chains rolled out, customer intimacy was sacrificed for scale, and retailers began to rely on segments and clusters.  But thanks to the widespread availability of data and the technology to convert said data into information, retailers are getting back to details. The retail industry is following a maturity model for analytics that is has progressed through five stages, each delivering more value than the previous. Store Analytics Brick-and-mortar retailers (and pure-play catalogers as well) that collect anonymous basket-level data are able to get some sense of demand to help with allocation decisions.  Promotions and foot-traffic can be measured to understand marketing effectiveness and perhaps focus groups can help test ideas.  But decisions are influenced by the majority, using faceless customer segments and aggregated industry data points.  Loyalty programs help a little, but in many cases the cost outweighs the benefits. Web Analytics The Web made it much easier to collect data on specific, yet still anonymous consumers using cookies to track visits. Clickstreams and product searches are analyzed to understand the purchase journey, gauge demand, and better understand up-selling opportunities.  Personalization begins to allow retailers target market consumers with recommendations. Cross-Channel Analytics This phase is a minor one, but where most retailers probably sit today.  They are able to use information from one channel to bolster activities in another. However, there are technical challenges combining data silos so its not an easy task.  But for those retailers that are able to perform analytics on both sources of data, the pay-off is pretty nice.  Revenue per customer begins to go up as customers have a better brand experience. Mobile & Social Analytics Big data technologies are enabling a 360-degree view of the customer by incorporating psychographic data from social sites alongside traditional demographic data.  Retailers can track individual preferences, opinions, hobbies, etc. in order to understand a consumer's motivations.  Using mobile devices, consumers can interact with brands anywhere, anytime, accessing deep product information and reviews.  Mobile, combined with a loyalty program, presents an opportunity to put shopping into geographic context, understanding paths to the store, patterns within the store, and be an always-on advertising conduit. Omni-Channel Analytics All this data along with the proper technology represents a new paradigm in which the clock is turned back and retail becomes very personal once again.  Rich, individualized data better illuminates demand, allows for highly localized assortments, and helps tailor up-selling.  Interactions with all channels help build an accurate profile of each consumer, and allows retailers to tailor the retail experience to meet the heightened expectations of today's sophisticated shopper.  And of course this culminates in greater customer satisfaction and business profitability.

    Read the article

  • Windows Azure Recipe: Consumer Portal

    - by Clint Edmonson
    Nearly every company on the internet has a web presence. Many are merely using theirs for informational purposes. More sophisticated portals allow customers to register their contact information and provide some level of interaction or customer support. But as our understanding of how consumers use the web increases, the more progressive companies are taking advantage of social web and rich media delivery to connect at a deeper level with the consumers of their goods and services. Drivers Cost reduction Scalability Global distribution Time to market Solution Here’s a sketch of how a Windows Azure Consumer Portal might be built out: Ingredients Web Role – this will host the core of the solution. Each web role is a virtual machine hosting an application written in ASP.NET (or optionally php, or node.js). The number of web roles can be scaled up or down as needed to handle peak and non-peak traffic loads. Database – every modern web application needs to store data. SQL Azure databases look and act exactly like their on-premise siblings but are fault tolerant and have data redundancy built in. Access Control (optional) – if identity needs to be tracked within the solution, the access control service combined with the Windows Identity Foundation framework provides out-of-the-box support for several social media platforms including Windows LiveID, Google, Yahoo!, Facebook. It also has a provider model to allow integration with other platforms as well. Caching (optional) – for sites with high traffic with lots of read-only data and lists, the distributed in-memory caching service can be used to cache and serve up static data at higher scale and speed than direct database requests. It can also be used to manage user session state. Blob Storage (optional) – for sites that serve up unstructured data such as documents, video, audio, device drivers, and more. The data is highly available and stored redundantly across data centers. Each entry in blob storage is provided with it’s own unique URL for direct access by the browser. Content Delivery Network (CDN) (optional) – for sites that service users around the globe, the CDN is an extension to blob storage that, when enabled, will automatically cache frequently accessed blobs and static site content at edge data centers around the world. The data can be delivered statically or streamed in the case of rich media content. Training Labs These links point to online Windows Azure training labs where you can learn more about the individual ingredients described above. (Note: The entire Windows Azure Training Kit can also be downloaded for offline use.) Windows Azure (16 labs) Windows Azure is an internet-scale cloud computing and services platform hosted in Microsoft data centers, which provides an operating system and a set of developer services which can be used individually or together. It gives developers the choice to build web applications; applications running on connected devices, PCs, or servers; or hybrid solutions offering the best of both worlds. New or enhanced applications can be built using existing skills with the Visual Studio development environment and the .NET Framework. With its standards-based and interoperable approach, the services platform supports multiple internet protocols, including HTTP, REST, SOAP, and plain XML SQL Azure (7 labs) Microsoft SQL Azure delivers on the Microsoft Data Platform vision of extending the SQL Server capabilities to the cloud as web-based services, enabling you to store structured, semi-structured, and unstructured data. Windows Azure Services (9 labs) As applications collaborate across organizational boundaries, ensuring secure transactions across disparate security domains is crucial but difficult to implement. Windows Azure Services provides hosted authentication and access control using powerful, secure, standards-based infrastructure. See my Windows Azure Resource Guide for more guidance on how to get started, including links web portals, training kits, samples, and blogs related to Windows Azure.

    Read the article

  • How to remove synaptic without installing all the unwanted packages?

    - by Jay
    I am trying to uninstall synaptic. I prefer using apt-get and other command line tools to manage my packages. So I do not need synaptic and the software manager. I'm trying to remove both of them using apt-get. Its a new box. Recently installed Linux Mint mate 15. After installation, the only thing I did was, sudo apt-get update and sudo apt-get dist-upgrade After that, I did this command for removing synaptic, sudo apt-get remove --purge synaptic But this gives me a very weird output, Reading package lists... Done Building dependency tree Reading state information... Done The following packages were automatically installed and are no longer required: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common Use 'apt-get autoremove' to remove them. The following extra packages will be installed: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 libxml2-utils nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common Suggested packages: libterm-readline-gnu-perl libterm-readline-perl-perl djvulibre-bin finger hspell libqca2-plugin-cyrus-sasl libqca2-plugin-gnupg libqca2-plugin-ossl phonon-backend-vlc phonon-backend-xine phonon-backend-mplayer The following packages will be REMOVED: aptoncd* apturl* mintupdate* mintwelcome* synaptic* The following NEW packages will be installed: apturl-kde icoutils kate-data katepart kde-runtime kde-runtime-data kdelibs-bin kdelibs5-data kdelibs5-plugins kdesudo kdoctools kubuntu-debug-installer libattica0.4 libdlrestrictions1 libkactivities-bin libkactivities-models1 libkactivities6 libkatepartinterfaces4 libkcmutils4 libkde3support4 libkdeclarative5 libkdecore5 libkdesu5 libkdeui5 libkdewebkit5 libkdnssd4 libkemoticons4 libkfile4 libkhtml5 libkidletime4 libkio5 libkjsapi4 libkjsembed4 libkmediaplayer4 libknewstuff3-4 libknotifyconfig4 libkntlm4 libkparts4 libkpty4 libkrosscore4 libktexteditor4 libkxmlrpcclient4 libnepomuk4 libnepomukcore4abi1 libnepomukquery4a libnepomukutils4 libntrack-qt4-1 libntrack0 libphonon4 libplasma3 libpolkit-qt-1-1 libpoppler-qt4-4 libqapt2 libqapt2-runtime libqca2 libqt4-qt3support libsolid4 libsoprano4 libstreamanalyzer0 libstreams0 libthreadweaver4 libvirtodbc0 libxml2-utils nepomuk-core nepomuk-core-data ntrack-module-libnl-0 odbcinst odbcinst1debian2 oxygen-icon-theme phonon phonon-backend-gstreamer plasma-scriptengine-javascript qapt-batch shared-desktop-ontologies soprano-daemon virtuoso-minimal virtuoso-opensource-6.1-bin virtuoso-opensource-6.1-common 0 upgraded, 78 newly installed, 5 to remove and 0 not upgraded. Need to get 60.9 MB of archives. After this operation, 146 MB of additional disk space will be used. Do you want to continue [Y/n]? n Abort. As you can see, apt-get is trying to install the same packages that it is asking me to autoremove. Could someone please tell me, how to uninstall synaptic properly? Or am I missing something? Just for the record, I also did, sudo apt-get autoremove --purge like it asked me to ... and this is what I got, Reading package lists... Done Building dependency tree Reading state information... Done 0 upgraded, 0 newly installed, 0 to remove and 6 not upgraded.

    Read the article

  • Organization &amp; Architecture UNISA Studies &ndash; Chap 6

    - by MarkPearl
    Learning Outcomes Discuss the physical characteristics of magnetic disks Describe how data is organized and accessed on a magnetic disk Discuss the parameters that play a role in the performance of magnetic disks Describe different optical memory devices Magnetic Disk The way data is stored on and retried from magnetic disks Data is recorded on and later retrieved form the disk via a conducting coil named the head (in many systems there are two heads) The writ mechanism exploits the fact that electricity flowing through a coil produces a magnetic field. Electric pulses are sent to the write head, and the resulting magnetic patterns are recorded on the surface below with different patterns for positive and negative currents The physical characteristics of a magnetic disk   Summarize from book   The factors that play a role in the performance of a disk Seek time – the time it takes to position the head at the track Rotational delay / latency – the time it takes for the beginning of the sector to reach the head Access time – the sum of the seek time and rotational delay Transfer time – the time it takes to transfer data RAID The rate of improvement in secondary storage performance has been considerably less than the rate for processors and main memory. Thus secondary storage has become a bit of a bottleneck. RAID works on the concept that if one disk can be pushed so far, additional gains in performance are to be had by using multiple parallel components. Points to note about RAID… RAID is a set of physical disk drives viewed by the operating system as a single logical drive Data is distributed across the physical drives of an array in a scheme known as striping Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure (not supported by RAID 0 or RAID 1) Interesting to note that the increase in the number of drives, increases the probability of failure. To compensate for this decreased reliability RAID makes use of stored parity information that enables the recovery of data lost due to a disk failure.   The RAID scheme consists of 7 levels…   Category Level Description Disks Required Data Availability Large I/O Data Transfer Capacity Small I/O Request Rate Striping 0 Non Redundant N Lower than single disk Very high Very high for both read and write Mirroring 1 Mirrored 2N Higher than RAID 2 – 5 but lower than RAID 6 Higher than single disk Up to twice that of a signle disk for read Parallel Access 2 Redundant via Hamming Code N + m Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Parallel Access 3 Bit interleaved parity N + 1 Much higher than single disk Highest of all listed alternatives Approximately twice that of a single disk Independent Access 4 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, significantly lower than single disk for write Similar to RAID 0 for read, significantly lower than single disk for write Independent Access 5 Block interleaved parity N + 1 Much higher than single disk Similar to RAID 0 for read, lower than single disk for write Similar to RAID 0 for read, generally  lower than single disk for write Independent Access 6 Block interleaved parity N + 2 Highest of all listed alternatives Similar to RAID 0 for read; lower than RAID 5 for write Similar to RAID 0 for read, significantly lower than RAID 5  for write   Read page 215 – 221 for detailed explanation on RAID levels Optical Memory There are a variety of optical-disk systems available. Read through the table on page 222 – 223 Some of the devices include… CD CD-ROM CD-R CD-RW DVD DVD-R DVD-RW Blue-Ray DVD Magnetic Tape Most modern systems use serial recording – data is lade out as a sequence of bits along each track. The typical recording used in serial is referred to as serpentine recording. In this technique when data is being recorded, the first set of bits is recorded along the whole length of the tape. When the end of the tape is reached the heads are repostioned to record a new track, and the tape is again recorded on its whole length, this time in the opposite direction. That process continued back and forth until the tape is full. To increase speed, the read-write head is capable of reading and writing a number of adjacent tracks simultaneously. Data is still recorded serially along individual tracks, but blocks in sequence are stored on adjacent tracks as suggested. A tape drive is a sequential access device. Magnetic tape was the first kind of secondary memory. It is still widely used as the lowest-cost, slowest speed member of the memory hierarchy.

    Read the article

  • Non-blocking I/O using Servlet 3.1: Scalable applications using Java EE 7 (TOTD #188)

    - by arungupta
    Servlet 3.0 allowed asynchronous request processing but only traditional I/O was permitted. This can restrict scalability of your applications. In a typical application, ServletInputStream is read in a while loop. public class TestServlet extends HttpServlet {    protected void doGet(HttpServletRequest request, HttpServletResponse response)         throws IOException, ServletException {     ServletInputStream input = request.getInputStream();       byte[] b = new byte[1024];       int len = -1;       while ((len = input.read(b)) != -1) {          . . .        }   }} If the incoming data is blocking or streamed slower than the server can read then the server thread is waiting for that data. The same can happen if the data is written to ServletOutputStream. This is resolved in Servet 3.1 (JSR 340, to be released as part Java EE 7) by adding event listeners - ReadListener and WriteListener interfaces. These are then registered using ServletInputStream.setReadListener and ServletOutputStream.setWriteListener. The listeners have callback methods that are invoked when the content is available to be read or can be written without blocking. The updated doGet in our case will look like: AsyncContext context = request.startAsync();ServletInputStream input = request.getInputStream();input.setReadListener(new MyReadListener(input, context)); Invoking setXXXListener methods indicate that non-blocking I/O is used instead of the traditional I/O. At most one ReadListener can be registered on ServletIntputStream and similarly at most one WriteListener can be registered on ServletOutputStream. ServletInputStream.isReady and ServletInputStream.isFinished are new methods to check the status of non-blocking I/O read. ServletOutputStream.canWrite is a new method to check if data can be written without blocking.  MyReadListener implementation looks like: @Overridepublic void onDataAvailable() { try { StringBuilder sb = new StringBuilder(); int len = -1; byte b[] = new byte[1024]; while (input.isReady() && (len = input.read(b)) != -1) { String data = new String(b, 0, len); System.out.println("--> " + data); } } catch (IOException ex) { Logger.getLogger(MyReadListener.class.getName()).log(Level.SEVERE, null, ex); }}@Overridepublic void onAllDataRead() { System.out.println("onAllDataRead"); context.complete();}@Overridepublic void onError(Throwable t) { t.printStackTrace(); context.complete();} This implementation has three callbacks: onDataAvailable callback method is called whenever data can be read without blocking onAllDataRead callback method is invoked data for the current request is completely read. onError callback is invoked if there is an error processing the request. Notice, context.complete() is called in onAllDataRead and onError to signal the completion of data read. For now, the first chunk of available data need to be read in the doGet or service method of the Servlet. Rest of the data can be read in a non-blocking way using ReadListener after that. This is going to get cleaned up where all data read can happen in ReadListener only. The sample explained above can be downloaded from here and works with GlassFish 4.0 build 64 and onwards. The slides and a complete re-run of What's new in Servlet 3.1: An Overview session at JavaOne is available here. Here are some more references for you: Java EE 7 Specification Status Servlet Specification Project JSR Expert Group Discussion Archive Servlet 3.1 Javadocs

    Read the article

  • How can I best implement 'cache until further notice' with memcache in multiple tiers?

    - by ajreal
    the term "client" used here is not referring to client's browser, but client server Before cache workflow 1. client make a HTTP request --> 2. server process --> 3. store parsed results into memcache for next use (cache indefinitely) --> 4. return results to client --> 5. client get the result, store into client's local memcache with TTL After cache workflow 1. another client make a HTTP request --> 2. memcache found return memcache results to client --> 3. client get the result, store into client's local memcache with TTL TTL = time to live Is possible for me to know when the data was updated, and to expire relevant memcache(s) accordingly. However, the pitfalls on client site cache TTL Any data update before the TTL is not pick-up by client memcache. In reverse manner, where there is no update, client memcache still expire after the TTL First request (or concurrent requests) after cache TTL will get throttle as it need to repeat the "Before cache workflow" In the event where client require several HTTP requests on a single web page, it could be very bad in performance. Ideal solution should be client to cache indefinitely until further notice. Here are the three proposals about futher notice Proposal 1 : Make use on HTTP header (current implementation) 1. client sent HTTP request last modified time header 2. server check if last data modified time=last cache time return status 304 3. client based on header to decide further processing GOOD? ---- - save some parsing for client - lesser data transfer BAD? ---- - fire a HTTP request is still slow - server end still need to process lots of requests Proposal 2 : Consistently issue a HTTP request to check all data group last modified time 1. client fire a HTTP request 2. server to return last modified time for all data group 3. client compare local last cache time with the result 4. if data group last cache time < server last modified time then request again for that data group only GOOD? ---- - only fetch what is no up-to-date - less requests for server BAD? ---- - every web page require a HTTP request Proposal 3 : Tell client when new data is available (Push) 1. when server end notice there is a change on a data group 2. notify clients on the changes 3. help clients to fetch again data 4. then reset client local memcache after data is parsed GOOD? ---- - let the cache act/behave like a true cache BAD? ---- - encourage race condition My preference is on proposal 3, and something like Gearman could be ideal Where there is a change, Gearman server to sent the task to multiple clients (workers). Am I crazy? (I know my first question is a bit crazy)

    Read the article

  • Clarification On Write-Caching Policy, Its Underlying Options And How It Applies To Hard Drives And Solid-State Drives

    - by Boris_yo
    In last week after doing more research on subject matter, I have been wondering about what I have been neglecting all those years to understand write-caching policy, always leaving it on default setting. Write-caching policy improves writing performance and consists of write-back caching and write-cache buffer flushing. This is how I understand all the above, but correct me if I erred somewhere: Write-through cache / Write-through caching itself is not a part of write caching policy per se and it's when data is written to both cache and storage device so if Windows will need that data later again, it is retrieved from cache and not from storage device which means only improved read performance as there is no need for waiting for storage device to read required data again. Since data is still written to storage device, write performance isn't improved and represents no risk of data loss or corruption in case of power failure or system crash while only data in cache gets lost. This option seems to be enabled by default and is recommended for removable devices with no need to use function of "Safely Remove Hardware" on user's part. Write-back caching is similar to above but without writing data to storage device, periodically releasing data from cache and writing to storage device when it is idle. In my opinion this option improves both read and write performance but represents risk if power failure or system crash occurs with the outcome of not only losing data eventually to be written to storage device, but causing file inconsistencies or corrupted file system. Write-back caching cannot be enabled together with write-through caching and it is not recommended to be enabled if no backup power supply is availabe. Write-cache buffer flushing I reckon is similar to write-back caching but enables immediate release and writing of data from cache to storage device right before power outage occurs but I don't know if it applies also to occasional system crash. This option seem to be complementary to write-back cache reducing or potentially eliminating risk of data loss or corruption of file system. I have questions about relevance of last 2 options to today's modern SSDs in order to get best performance and with less wear on SSDs: I know that traditional hard drives come with onboard cache (I wonder what type of cache that is), but do SSDs also come with cache? Assuming they do, is this cache faster than their NAND flash and system RAM and worth taking the risk of utilizing it by enabling write-back cache? I read somewhere that generally storage device's cache is faster than RAM, but I want to be sure. Additionally I read that write-caching should be enabled since current data that is to be written later to NAND flash is kept for a while in cache and provided there is data that gets modified a lot before finally being written, holding of this data and its periodic release reduces its write times to SSD thereby reducing its wearing. Now regarding to write-cache buffer flushing, I heard that SSD controllers are so fast by themselves that enabling this option is not required, because they manage flushing. However, once again, I don't know if SSDs have their own onboard cache and whether or not it is faster than their NAND flash and system RAM because if it is, keeping this option enabled would make sense. Recently I have posted question about issue with my Intel 330 SSD 120GB which was main reason to do deeper research having suspicion of write-caching policy being the culprit of SSD's freezing issue assuming data being released is what causes freezes. Currently I have write-cache enabled and write-cache buffer flushing disabled because I believe SSD controller's management of write-cache flushing and Windows write-cache buffer flushing are conflicting with each other: Since I want to troubleshoot in small steps to finally determine the source of issue, I have decided to start with write-caching policy and the move to drivers, switching to AHCI later on and finally disabling DIPM (device initiated power management) through registry modification thanks to @TomWijsman

    Read the article

  • DSOFramer closing Excel doc in another window. If unsaved data in file, dsoframer fails to open with

    - by Steve
    I'm using Microsoft's DSOFramer control to allow me to embed an Excel file in my dialog so the user can choose his sheet, then select his range of cells; it's used with an import button on my dialog. The problem is that when I call the DSOFramer's OPEN function, if I have Excel open in another window, it closes the Excel document (but leaves Excel running). If the document it tries to close has unsaved data, I get a dialog boxclosing Excel doc in another window. If unsaved data in file, dsoframer fails to open with a messagebox: "Attempt to access invalid address". I built the source, and stepped through, and its making a call in its CDsoDocObject::CreateFromFile function, calling BindToObject on an object of class IMoniker. The HR is 0x8001010a "The message filter indicated that the application is busy". On that failure, it tries to InstantiateDocObjectServer by classid of CLSID Microsoft Excel Worksheet... this fails with an HRESULT of 0x80040154 "Class not registered". The InstantiateDocObjectServer just calls CoCreateInstance on the classid, first with CLSCTX_LOCAL_SERVER, then (if that fails) with CLSCTX_INPROC_SERVER. I know DSOFramer is a popular sample project for embedding Office apps in various dialogs and forms. I'm hoping someone else has had this problem and might have some insight on how I can solve this. I really don't want it to close any other open Excel documents, and I really don't want it to error-out if it can't close the document due to unsaved data. Update 1: I've tried changing the classid that's passed in to "Excel.Application" (I know that class will resolve), but that didn't work. In CDsoDocObject, it tries to open key "HKEY_CLASSES_ROOT\CLSID{00024500-0000-0000-C000-000000000046}\DocObject", but fails. I've visually confirmed that the key is not present in my registry; The key is present for the guid, but there's no DocObject subkey. It then produces an error message box: "The associated COM server does not support ActiveX document embedding". I get similar (different key, of course) results when I try to use the Excel.Workbook programid. Update 2: I tried starting a 2nd instance of Excel, hoping that my automation would bind to it (being the most recently invoked) instead of the problem Excel instance, but it didn't seem to do that. Results were the same. My problem seems to have boiled down to this: I'm calling the BindToObject on an object of class IMoniker, and receiving 0x8001010A (RPC_E_SERVERCALL_RETRYLATER) "The message filter indicated that the application is busy". I've tried playing with the flags passed to the BindToObject (via the SetBindOptions), but nothing seems to make any difference. Update 3: It first tries to bind using an IMoniker class. If that fails, it calls CoCreateInstance for the clsid as a "fallback" method. This may work for other MS Office objects, but when it's Excel, the class is for the Worksheet. I modified the sample to CoCreateInstance _Application, then got the workbooks, then called the Workbooks::Open for the target file, which returns a Worksheet object. I then returned that pointer and merged back with the original sample code path. All working now.

    Read the article

  • Improving the performance of XSL

    - by Rachel
    In the below XSL for the variable "insert-data", I have an input param with the structure, <insert-data> <data compareIndex="4" nodeName="d1e1"> <a/> </data> <data compareIndex="5" nodeName="d1e1"> <b/> </data> <data compareIndex="7" nodeName="d1e2"> <a/> </data> <data compareIndex="9" nodeName="d1e2"> <b/> </data> </insert-data> where "nodeName" is the id of a node and "compareIndex" is the position of the text content relative to the node having id "$nodeName". I am using the below XSL to select all the text nodes(generate-id) that satisfy the above condition and construct a data xml. The below implementation works perfectly but the time taken for the execution is in min. Is there a better way of implementing or is there any in-efficient operation being used. From my observation the code where the preceding text length is calculated consumes the major time. Please share your thoughts to improve the performance of the XSL. I am using Java SAXON XSL transformer. <xsl:variable name="insert-data" as="element()*"> <xsl:for-each select="$insert-file/insert-data/data"> <xsl:sort select="xsd:integer(@index)"/> <xsl:variable name="compareIndex" select="xsd:integer(@compareIndex)" /> <xsl:variable name="nodeName" select="@nodeName" /> <xsl:variable name="nodeContent" as="node()"> <xsl:copy-of select="node()"/> </xsl:variable> <xsl:for-each select="$main-root/*//text()[ancestor::*[@id = $nodeName]]"> <xsl:variable name="preTextLength" as="xsd:integer" select="sum((preceding::text())[. ancestor::*[@id = $nodeName]]/string-length(.))" /> <xsl:variable name="currentTextLength" as="xsd:integer" select="string-length(.)" /> <xsl:variable name="sum" select="$preTextLength + $currentTextLength" as="xsd:integer"></xsl:variable> <xsl:variable name="split-index" select="$compareIndex - $preTextLength" as="xsd:integer"></xsl:variable> <xsl:if test="($sum ge $compareIndex) and ($compareIndex gt $preTextLength)"> <data split-index="{$split-index}" text-id="{generate-id(.)}"> <xsl:copy-of select="$nodeContent"/> </data> </xsl:if> </xsl:for-each> </xsl:for-each> </xsl:variable>

    Read the article

  • How do I get a Java to call data from the Internet? Where to even start??

    - by cdg
    Hello oh great wizards of all things android. I really need your help. Mostly because my little brain just doesn't know were to start. I am trying to pull data from the internet to make a widget for the home screen. I have the layout built: <?xml version="1.0" encoding="utf-8"?> <RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android" android:id="@+id/Layout" android:layout_width="fill_parent" android:layout_height="fill_parent" android:background="@drawable/widget_bg_normal" android:clipChildren="false" > <TextView android:id="@+id/text_view" android:layout_width="100px" android:layout_height="wrap_content" android:paddingTop="18px" android:layout_centerHorizontal="true" android:textSize="8px" android:text="158x154 Image downloaded from the internet goes here. Needs to be updated every evening at midnight or unless the button below is pressed. Now if I could only figure out exactly how to do this, life would be good." /> <Button android:id="@+id/new_button" android:layout_width="fill_parent" android:layout_height="wrap_content" android:text="Get New" android:layout_below="@+id/scroll_image" android:layout_centerHorizontal="true" android:padding="0px" android:textSize="10px" android:height="8px" android:includeFontPadding="false" /> </RelativeLayout> Got the provider xml bulit: <?xml version="1.0" encoding="utf-8"?> <appwidget-provider xmlns:android="http://schemas.android.com/apk/res/android" android:minWidth="150dip" android:minHeight="150dip" android:updatePeriodMillis="10000" android:initialLayout="@layout/widget" /> The Manifest works great. <?xml version="1.0" encoding="utf-8"?> <manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.dge.myandroid" android:versionCode="1" android:versionName="1.0"> <application android:icon="@drawable/icon" android:label="@string/app_name"> <activity android:name=".myactivty" android:label="@string/app_name"> <intent-filter> <action android:name="android.intent.action.MAIN" /> <category android:name="android.intent.category.LAUNCHER" /> </intent-filter> </activity> <!-- Widget --> <receiver android:name=".mywidget" android:label="@string/app_name" > <intent-filter> <action android:name="android.appwidget.action.APPWIDGET_UPDATE" /> </intent-filter> <meta-data android:name="android.appwidget.widgetprovider" android:resource="@xml/widgetprovider" /> </receiver> <!-- Widget End --> </application> <uses-permission android:name="android.permission.INTERNET" /> <uses-sdk android:minSdkVersion="7" /> </manifest> The data it is calling looks something like this when it is called. It basically goes to a website that uses php to random the image: <html><body bgcolor="#000000">center> <a href="http://www.website.com" target="_blank"> <img border="0" src="http://www.webiste.com//0.gif"></a> <img src="http://www.webiste.com" style="border:none;" /> </center></body></html> But here is were I am stuck. I just don't know where to start at all. The java is so far beyond my little head that I don't know what to do. package com.dge.myandroid; import android.appwidget.AppWidgetProvider; public class mywidget extends AppWidgetProvider { } The wiki example just confused me more. I just don't know where to begin. Please help.

    Read the article

  • how can udp data can passed through RS232 in ansi c?

    - by moon
    i want to transmit and receive data on RS232 using udp and i want to know about techniques which allow me to transmit and receive data on a faster rate and also no lose of data is there? thanx in advance. i have tried but need improvements if possible #include <stdio.h> #include <dos.h> #include<string.h> #include<conio.h> #include<iostream.h> #include<stdlib.h> #define PORT1 0x3f8 void main() { int c,ch,choice,i,a=0; char filename[30],filename2[30],buf; FILE *in,*out; clrscr(); while(1){ outportb(PORT1+0,0x03); outportb(PORT1+1,0); outportb(PORT1+3,0x03); outportb(PORT1+2,0xc7); outportb(PORT1+4,0x0b); cout<<"\n==============================================================="; cout<<"\n\t*****Serial Communication By BADR-U-ZAMAN******\nCommunication between two computers By serial port"; cout<<"\nPlease select\n[1]\tFor sending file \n[2]\tFor receiving file \n[3]\tTo exit\n"; cout<<"=================================================================\n"; cin>>choice; if(choice==1) { strcpy(filename,"C:\\TC\\BIN\\badr.cpp"); cout<<filename; for(i=0;i<=strlen(filename);i++) outportb(PORT1,filename[i]); in=fopen(filename,"r"); if (in==NULL) { cout<<"cannot open a file"; a=1; } if(a!=1) cout<<"\n\nFile sending.....\n\n"; while(!feof(in)) { buf=fgetc(in); cout<<buf; outportb(PORT1,buf); delay(5); } } else { if(choice==3) exit(0); i=0; buf='a'; while(buf!=NULL) { c=inportb(PORT1+5); if(c&1) { buf=inportb(PORT1); filename2[i]=buf; i++; } } out=fopen(filename2,"t"); cout<<"\n Filename received:"<<filename[2]; cout<<"\nReading from the port..."; cout<<"writing to file"<<filename2; do { c=inportb(PORT1+5); if(c&1) { buf=inportb(PORT1); cout<<buf; fputc(buf,out); delay(5); } if(kbhit()) { ch=getch(); } }while(ch!=27); } getch(); } }

    Read the article

  • What's the fastest lookup algorithm for a pair data structure (i.e, a map)?

    - by truncheon
    In the following example a std::map structure is filled with 26 values from A - Z (for key) and 0 – 26 for value. The time taken (on my system) to lookup the last entry (10000000 times) is roughly 250 ms for the vector, and 125 ms for the map. (I compiled using release mode, with O3 option turned on for g++ 4.4) But if for some odd reason I wanted better performance than the std::map, what data structures and functions would I need to consider using? I apologize if the answer seems obvious to you, but I haven't had much experience in the performance critical aspects of C++ programming. UPDATE: This example is rather trivial and hides the true complexity of what I'm trying to achieve. My real world project is a simple scripting language that uses a parser, data tree, and interpreter (instead of a VM stack system). I need to use some kind of data structure (perhaps map) to store the variables names created by script programmers. These are likely to be pretty randomly named, so I need a lookup method that can quickly find a particular key within a (probably) fairly large list of names. #include <ctime> #include <map> #include <vector> #include <iostream> struct mystruct { char key; int value; mystruct(char k = 0, int v = 0) : key(k), value(v) { } }; int find(const std::vector<mystruct>& ref, char key) { for (std::vector<mystruct>::const_iterator i = ref.begin(); i != ref.end(); ++i) if (i->key == key) return i->value; return -1; } int main() { std::map<char, int> mymap; std::vector<mystruct> myvec; for (int i = 'a'; i < 'a' + 26; ++i) { mymap[i] = i - 'a'; myvec.push_back(mystruct(i, i - 'a')); } int pre = clock(); for (int i = 0; i < 10000000; ++i) { find(myvec, 'z'); } std::cout << "linear scan: milli " << clock() - pre << "\n"; pre = clock(); for (int i = 0; i < 10000000; ++i) { mymap['z']; } std::cout << "map scan: milli " << clock() - pre << "\n"; return 0; }

    Read the article

  • C++ function will not return

    - by Mike
    I have a function that I am calling that runs all the way up to where it should return but doesn't return. If I cout something for debugging at the very end of the function, it gets displayed but the function does not return. fetchData is the function I am referring to. It gets called by outputFile. cout displays "done here" but not "data fetched" I know this code is messy but can anyone help me figure this out? Thanks //Given an inode return all data of i_block data char* fetchData(iNode tempInode){ char* data; data = new char[tempInode.i_size]; this->currentInodeSize = tempInode.i_size; //Loop through blocks to retrieve data vector<unsigned int> i_blocks; i_blocks.reserve(tempInode.i_blocks); this->currentDataPosition = 0; cout << "currentDataPosition set to 0" << std::endl; cout << "i_blocks:" << tempInode.i_blocks << std::endl; int i = 0; for(i = 0; i < 12; i++){ if(tempInode.i_block[i] == 0) break; i_blocks.push_back(tempInode.i_block[i]); } appendIndirectData(tempInode.i_block[12], &i_blocks); appendDoubleIndirectData(tempInode.i_block[13], &i_blocks); appendTripleIndirectData(tempInode.i_block[14], &i_blocks); //Loop through all the block addresses to get the actual data for(i=0; i < i_blocks.size(); i++){ appendData(i_blocks[i], data); } cout << "done here" << std::endl; return data; } void appendData(int block, char* data){ char* tempBuffer; tempBuffer = new char[this->blockSize]; ifstream file (this->filename, std::ios::binary); int entryLocation = block*this->blockSize; file.seekg (entryLocation, ios::beg); file.read(tempBuffer, this->blockSize); //Append this block to data for(int i=0; i < this->blockSize; i++){ data[this->currentDataPosition] = tempBuffer[i]; this->currentDataPosition++; } data[this->currentDataPosition] = '\0'; } void outputFile(iNode file, string filename){ char* data; cout << "File Transfer Started" << std::endl; data = this->fetchData(file); cout << "data fetched" << std::endl; char *outputFile = (char*)filename.c_str(); ofstream myfile; myfile.open (outputFile,ios::out|ios::binary); int i = 0; for(i=0; i < file.i_size; i++){ myfile << data[i]; } myfile.close(); cout << "File Transfer Completed" << std::endl; return; }

    Read the article

  • Upload File to Windows Azure Blob in Chunks through ASP.NET MVC, JavaScript and HTML5

    - by Shaun
    Originally posted on: http://geekswithblogs.net/shaunxu/archive/2013/07/01/upload-file-to-windows-azure-blob-in-chunks-through-asp.net.aspxMany people are using Windows Azure Blob Storage to store their data in the cloud. Blob storage provides 99.9% availability with easy-to-use API through .NET SDK and HTTP REST. For example, we can store JavaScript files, images, documents in blob storage when we are building an ASP.NET web application on a Web Role in Windows Azure. Or we can store our VHD files in blob and mount it as a hard drive in our cloud service. If you are familiar with Windows Azure, you should know that there are two kinds of blob: page blob and block blob. The page blob is optimized for random read and write, which is very useful when you need to store VHD files. The block blob is optimized for sequential/chunk read and write, which has more common usage. Since we can upload block blob in blocks through BlockBlob.PutBlock, and them commit them as a whole blob with invoking the BlockBlob.PutBlockList, it is very powerful to upload large files, as we can upload blocks in parallel, and provide pause-resume feature. There are many documents, articles and blog posts described on how to upload a block blob. Most of them are focus on the server side, which means when you had received a big file, stream or binaries, how to upload them into blob storage in blocks through .NET SDK.  But the problem is, how can we upload these large files from client side, for example, a browser. This questioned to me when I was working with a Chinese customer to help them build a network disk production on top of azure. The end users upload their files from the web portal, and then the files will be stored in blob storage from the Web Role. My goal is to find the best way to transform the file from client (end user’s machine) to the server (Web Role) through browser. In this post I will demonstrate and describe what I had done, to upload large file in chunks with high speed, and save them as blocks into Windows Azure Blob Storage.   Traditional Upload, Works with Limitation The simplest way to implement this requirement is to create a web page with a form that contains a file input element and a submit button. 1: @using (Html.BeginForm("About", "Index", FormMethod.Post, new { enctype = "multipart/form-data" })) 2: { 3: <input type="file" name="file" /> 4: <input type="submit" value="upload" /> 5: } And then in the backend controller, we retrieve the whole content of this file and upload it in to the blob storage through .NET SDK. We can split the file in blocks and upload them in parallel and commit. The code had been well blogged in the community. 1: [HttpPost] 2: public ActionResult About(HttpPostedFileBase file) 3: { 4: var container = _client.GetContainerReference("test"); 5: container.CreateIfNotExists(); 6: var blob = container.GetBlockBlobReference(file.FileName); 7: var blockDataList = new Dictionary<string, byte[]>(); 8: using (var stream = file.InputStream) 9: { 10: var blockSizeInKB = 1024; 11: var offset = 0; 12: var index = 0; 13: while (offset < stream.Length) 14: { 15: var readLength = Math.Min(1024 * blockSizeInKB, (int)stream.Length - offset); 16: var blockData = new byte[readLength]; 17: offset += stream.Read(blockData, 0, readLength); 18: blockDataList.Add(Convert.ToBase64String(BitConverter.GetBytes(index)), blockData); 19:  20: index++; 21: } 22: } 23:  24: Parallel.ForEach(blockDataList, (bi) => 25: { 26: blob.PutBlock(bi.Key, new MemoryStream(bi.Value), null); 27: }); 28: blob.PutBlockList(blockDataList.Select(b => b.Key).ToArray()); 29:  30: return RedirectToAction("About"); 31: } This works perfect if we selected an image, a music or a small video to upload. But if I selected a large file, let’s say a 6GB HD-movie, after upload for about few minutes the page will be shown as below and the upload will be terminated. In ASP.NET there is a limitation of request length and the maximized request length is defined in the web.config file. It’s a number which less than about 4GB. So if we want to upload a really big file, we cannot simply implement in this way. Also, in Windows Azure, a cloud service network load balancer will terminate the connection if exceed the timeout period. From my test the timeout looks like 2 - 3 minutes. Hence, when we need to upload a large file we cannot just use the basic HTML elements. Besides the limitation mentioned above, the simple HTML file upload cannot provide rich upload experience such as chunk upload, pause and pause-resume. So we need to find a better way to upload large file from the client to the server.   Upload in Chunks through HTML5 and JavaScript In order to break those limitation mentioned above we will try to upload the large file in chunks. This takes some benefit to us such as - No request size limitation: Since we upload in chunks, we can define the request size for each chunks regardless how big the entire file is. - No timeout problem: The size of chunks are controlled by us, which means we should be able to make sure request for each chunk upload will not exceed the timeout period of both ASP.NET and Windows Azure load balancer. It was a big challenge to upload big file in chunks until we have HTML5. There are some new features and improvements introduced in HTML5 and we will use them to implement our solution.   In HTML5, the File interface had been improved with a new method called “slice”. It can be used to read part of the file by specifying the start byte index and the end byte index. For example if the entire file was 1024 bytes, file.slice(512, 768) will read the part of this file from the 512nd byte to 768th byte, and return a new object of interface called "Blob”, which you can treat as an array of bytes. In fact,  a Blob object represents a file-like object of immutable, raw data. The File interface is based on Blob, inheriting blob functionality and expanding it to support files on the user's system. For more information about the Blob please refer here. File and Blob is very useful to implement the chunk upload. We will use File interface to represent the file the user selected from the browser and then use File.slice to read the file in chunks in the size we wanted. For example, if we wanted to upload a 10MB file with 512KB chunks, then we can read it in 512KB blobs by using File.slice in a loop.   Assuming we have a web page as below. User can select a file, an input box to specify the block size in KB and a button to start upload. 1: <div> 2: <input type="file" id="upload_files" name="files[]" /><br /> 3: Block Size: <input type="number" id="block_size" value="512" name="block_size" />KB<br /> 4: <input type="button" id="upload_button_blob" name="upload" value="upload (blob)" /> 5: </div> Then we can have the JavaScript function to upload the file in chunks when user clicked the button. 1: <script type="text/javascript"> 1: 2: $(function () { 3: $("#upload_button_blob").click(function () { 4: }); 5: });</script> Firstly we need to ensure the client browser supports the interfaces we are going to use. Just try to invoke the File, Blob and FormData from the “window” object. If any of them is “undefined” the condition result will be “false” which means your browser doesn’t support these premium feature and it’s time for you to get your browser updated. FormData is another new feature we are going to use in the future. It could generate a temporary form for us. We will use this interface to create a form with chunk and associated metadata when invoked the service through ajax. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: if (window.File && window.Blob && window.FormData) { 4: alert("Your brwoser is awesome, let's rock!"); 5: } 6: else { 7: alert("Oh man plz update to a modern browser before try is cool stuff out."); 8: return; 9: } 10: }); Each browser supports these interfaces by their own implementation and currently the Blob, File and File.slice are supported by Chrome 21, FireFox 13, IE 10, Opera 12 and Safari 5.1 or higher. After that we worked on the files the user selected one by one since in HTML5, user can select multiple files in one file input box. 1: var files = $("#upload_files")[0].files; 2: for (var i = 0; i < files.length; i++) { 3: var file = files[i]; 4: var fileSize = file.size; 5: var fileName = file.name; 6: } Next, we calculated the start index and end index for each chunks based on the size the user specified from the browser. We put them into an array with the file name and the index, which will be used when we upload chunks into Windows Azure Blob Storage as blocks since we need to specify the target blob name and the block index. At the same time we will store the list of all indexes into another variant which will be used to commit blocks into blob in Azure Storage once all chunks had been uploaded successfully. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10:  11: // calculate the start and end byte index for each blocks(chunks) 12: // with the index, file name and index list for future using 13: var blockSizeInKB = $("#block_size").val(); 14: var blockSize = blockSizeInKB * 1024; 15: var blocks = []; 16: var offset = 0; 17: var index = 0; 18: var list = ""; 19: while (offset < fileSize) { 20: var start = offset; 21: var end = Math.min(offset + blockSize, fileSize); 22:  23: blocks.push({ 24: name: fileName, 25: index: index, 26: start: start, 27: end: end 28: }); 29: list += index + ","; 30:  31: offset = end; 32: index++; 33: } 34: } 35: }); Now we have all chunks’ information ready. The next step should be upload them one by one to the server side, and at the server side when received a chunk it will upload as a block into Blob Storage, and finally commit them with the index list through BlockBlobClient.PutBlockList. But since all these invokes are ajax calling, which means not synchronized call. So we need to introduce a new JavaScript library to help us coordinate the asynchronize operation, which named “async.js”. You can download this JavaScript library here, and you can find the document here. I will not explain this library too much in this post. We will put all procedures we want to execute as a function array, and pass into the proper function defined in async.js to let it help us to control the execution sequence, in series or in parallel. Hence we will define an array and put the function for chunk upload into this array. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4:  5: // start to upload each files in chunks 6: var files = $("#upload_files")[0].files; 7: for (var i = 0; i < files.length; i++) { 8: var file = files[i]; 9: var fileSize = file.size; 10: var fileName = file.name; 11: // calculate the start and end byte index for each blocks(chunks) 12: // with the index, file name and index list for future using 13: ... ... 14:  15: // define the function array and push all chunk upload operation into this array 16: blocks.forEach(function (block) { 17: putBlocks.push(function (callback) { 18: }); 19: }); 20: } 21: }); 22: }); As you can see, I used File.slice method to read each chunks based on the start and end byte index we calculated previously, and constructed a temporary HTML form with the file name, chunk index and chunk data through another new feature in HTML5 named FormData. Then post this form to the backend server through jQuery.ajax. This is the key part of our solution. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: blocks.forEach(function (block) { 15: putBlocks.push(function (callback) { 16: // load blob based on the start and end index for each chunks 17: var blob = file.slice(block.start, block.end); 18: // put the file name, index and blob into a temporary from 19: var fd = new FormData(); 20: fd.append("name", block.name); 21: fd.append("index", block.index); 22: fd.append("file", blob); 23: // post the form to backend service (asp.net mvc controller action) 24: $.ajax({ 25: url: "/Home/UploadInFormData", 26: data: fd, 27: processData: false, 28: contentType: "multipart/form-data", 29: type: "POST", 30: success: function (result) { 31: if (!result.success) { 32: alert(result.error); 33: } 34: callback(null, block.index); 35: } 36: }); 37: }); 38: }); 39: } 40: }); Then we will invoke these functions one by one by using the async.js. And once all functions had been executed successfully I invoked another ajax call to the backend service to commit all these chunks (blocks) as the blob in Windows Azure Storage. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.series(putBlocks, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: }); That’s all in the client side. The outline of our logic would be - Calculate the start and end byte index for each chunks based on the block size. - Defined the functions of reading the chunk form file and upload the content to the backend service through ajax. - Execute the functions defined in previous step with “async.js”. - Commit the chunks by invoking the backend service in Windows Azure Storage finally.   Save Chunks as Blocks into Blob Storage In above we finished the client size JavaScript code. It uploaded the file in chunks to the backend service which we are going to implement in this step. We will use ASP.NET MVC as our backend service, and it will receive the chunks, upload into Windows Azure Bob Storage in blocks, then finally commit as one blob. As in the client side we uploaded chunks by invoking the ajax call to the URL "/Home/UploadInFormData", I created a new action under the Index controller and it only accepts HTTP POST request. 1: [HttpPost] 2: public JsonResult UploadInFormData() 3: { 4: var error = string.Empty; 5: try 6: { 7: } 8: catch (Exception e) 9: { 10: error = e.ToString(); 11: } 12:  13: return new JsonResult() 14: { 15: Data = new 16: { 17: success = string.IsNullOrWhiteSpace(error), 18: error = error 19: } 20: }; 21: } Then I retrieved the file name, index and the chunk content from the Request.Form object, which was passed from our client side. And then, used the Windows Azure SDK to create a blob container (in this case we will use the container named “test”.) and create a blob reference with the blob name (same as the file name). Then uploaded the chunk as a block of this blob with the index, since in Blob Storage each block must have an index (ID) associated with so that finally we can put all blocks as one blob by specifying their block ID list. 1: [HttpPost] 2: public JsonResult UploadInFormData() 3: { 4: var error = string.Empty; 5: try 6: { 7: var name = Request.Form["name"]; 8: var index = int.Parse(Request.Form["index"]); 9: var file = Request.Files[0]; 10: var id = Convert.ToBase64String(BitConverter.GetBytes(index)); 11:  12: var container = _client.GetContainerReference("test"); 13: container.CreateIfNotExists(); 14: var blob = container.GetBlockBlobReference(name); 15: blob.PutBlock(id, file.InputStream, null); 16: } 17: catch (Exception e) 18: { 19: error = e.ToString(); 20: } 21:  22: return new JsonResult() 23: { 24: Data = new 25: { 26: success = string.IsNullOrWhiteSpace(error), 27: error = error 28: } 29: }; 30: } Next, I created another action to commit the blocks into blob once all chunks had been uploaded. Similarly, I retrieved the blob name from the Request.Form. I also retrieved the chunks ID list, which is the block ID list from the Request.Form in a string format, split them as a list, then invoked the BlockBlob.PutBlockList method. After that our blob will be shown in the container and ready to be download. 1: [HttpPost] 2: public JsonResult Commit() 3: { 4: var error = string.Empty; 5: try 6: { 7: var name = Request.Form["name"]; 8: var list = Request.Form["list"]; 9: var ids = list 10: .Split(',') 11: .Where(id => !string.IsNullOrWhiteSpace(id)) 12: .Select(id => Convert.ToBase64String(BitConverter.GetBytes(int.Parse(id)))) 13: .ToArray(); 14:  15: var container = _client.GetContainerReference("test"); 16: container.CreateIfNotExists(); 17: var blob = container.GetBlockBlobReference(name); 18: blob.PutBlockList(ids); 19: } 20: catch (Exception e) 21: { 22: error = e.ToString(); 23: } 24:  25: return new JsonResult() 26: { 27: Data = new 28: { 29: success = string.IsNullOrWhiteSpace(error), 30: error = error 31: } 32: }; 33: } Now we finished all code we need. The whole process of uploading would be like this below. Below is the full client side JavaScript code. 1: <script type="text/javascript" src="~/Scripts/async.js"></script> 2: <script type="text/javascript"> 3: $(function () { 4: $("#upload_button_blob").click(function () { 5: // assert the browser support html5 6: if (window.File && window.Blob && window.FormData) { 7: alert("Your brwoser is awesome, let's rock!"); 8: } 9: else { 10: alert("Oh man plz update to a modern browser before try is cool stuff out."); 11: return; 12: } 13:  14: // start to upload each files in chunks 15: var files = $("#upload_files")[0].files; 16: for (var i = 0; i < files.length; i++) { 17: var file = files[i]; 18: var fileSize = file.size; 19: var fileName = file.name; 20:  21: // calculate the start and end byte index for each blocks(chunks) 22: // with the index, file name and index list for future using 23: var blockSizeInKB = $("#block_size").val(); 24: var blockSize = blockSizeInKB * 1024; 25: var blocks = []; 26: var offset = 0; 27: var index = 0; 28: var list = ""; 29: while (offset < fileSize) { 30: var start = offset; 31: var end = Math.min(offset + blockSize, fileSize); 32:  33: blocks.push({ 34: name: fileName, 35: index: index, 36: start: start, 37: end: end 38: }); 39: list += index + ","; 40:  41: offset = end; 42: index++; 43: } 44:  45: // define the function array and push all chunk upload operation into this array 46: var putBlocks = []; 47: blocks.forEach(function (block) { 48: putBlocks.push(function (callback) { 49: // load blob based on the start and end index for each chunks 50: var blob = file.slice(block.start, block.end); 51: // put the file name, index and blob into a temporary from 52: var fd = new FormData(); 53: fd.append("name", block.name); 54: fd.append("index", block.index); 55: fd.append("file", blob); 56: // post the form to backend service (asp.net mvc controller action) 57: $.ajax({ 58: url: "/Home/UploadInFormData", 59: data: fd, 60: processData: false, 61: contentType: "multipart/form-data", 62: type: "POST", 63: success: function (result) { 64: if (!result.success) { 65: alert(result.error); 66: } 67: callback(null, block.index); 68: } 69: }); 70: }); 71: }); 72:  73: // invoke the functions one by one 74: // then invoke the commit ajax call to put blocks into blob in azure storage 75: async.series(putBlocks, function (error, result) { 76: var data = { 77: name: fileName, 78: list: list 79: }; 80: $.post("/Home/Commit", data, function (result) { 81: if (!result.success) { 82: alert(result.error); 83: } 84: else { 85: alert("done!"); 86: } 87: }); 88: }); 89: } 90: }); 91: }); 92: </script> And below is the full ASP.NET MVC controller code. 1: public class HomeController : Controller 2: { 3: private CloudStorageAccount _account; 4: private CloudBlobClient _client; 5:  6: public HomeController() 7: : base() 8: { 9: _account = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("DataConnectionString")); 10: _client = _account.CreateCloudBlobClient(); 11: } 12:  13: public ActionResult Index() 14: { 15: ViewBag.Message = "Modify this template to jump-start your ASP.NET MVC application."; 16:  17: return View(); 18: } 19:  20: [HttpPost] 21: public JsonResult UploadInFormData() 22: { 23: var error = string.Empty; 24: try 25: { 26: var name = Request.Form["name"]; 27: var index = int.Parse(Request.Form["index"]); 28: var file = Request.Files[0]; 29: var id = Convert.ToBase64String(BitConverter.GetBytes(index)); 30:  31: var container = _client.GetContainerReference("test"); 32: container.CreateIfNotExists(); 33: var blob = container.GetBlockBlobReference(name); 34: blob.PutBlock(id, file.InputStream, null); 35: } 36: catch (Exception e) 37: { 38: error = e.ToString(); 39: } 40:  41: return new JsonResult() 42: { 43: Data = new 44: { 45: success = string.IsNullOrWhiteSpace(error), 46: error = error 47: } 48: }; 49: } 50:  51: [HttpPost] 52: public JsonResult Commit() 53: { 54: var error = string.Empty; 55: try 56: { 57: var name = Request.Form["name"]; 58: var list = Request.Form["list"]; 59: var ids = list 60: .Split(',') 61: .Where(id => !string.IsNullOrWhiteSpace(id)) 62: .Select(id => Convert.ToBase64String(BitConverter.GetBytes(int.Parse(id)))) 63: .ToArray(); 64:  65: var container = _client.GetContainerReference("test"); 66: container.CreateIfNotExists(); 67: var blob = container.GetBlockBlobReference(name); 68: blob.PutBlockList(ids); 69: } 70: catch (Exception e) 71: { 72: error = e.ToString(); 73: } 74:  75: return new JsonResult() 76: { 77: Data = new 78: { 79: success = string.IsNullOrWhiteSpace(error), 80: error = error 81: } 82: }; 83: } 84: } And if we selected a file from the browser we will see our application will upload chunks in the size we specified to the server through ajax call in background, and then commit all chunks in one blob. Then we can find the blob in our Windows Azure Blob Storage.   Optimized by Parallel Upload In previous example we just uploaded our file in chunks. This solved the problem that ASP.NET MVC request content size limitation as well as the Windows Azure load balancer timeout. But it might introduce the performance problem since we uploaded chunks in sequence. In order to improve the upload performance we could modify our client side code a bit to make the upload operation invoked in parallel. The good news is that, “async.js” library provides the parallel execution function. If you remembered the code we invoke the service to upload chunks, it utilized “async.series” which means all functions will be executed in sequence. Now we will change this code to “async.parallel”. This will invoke all functions in parallel. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.parallel(putBlocks, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: }); In this way all chunks will be uploaded to the server side at the same time to maximize the bandwidth usage. This should work if the file was not very large and the chunk size was not very small. But for large file this might introduce another problem that too many ajax calls are sent to the server at the same time. So the best solution should be, upload the chunks in parallel with maximum concurrency limitation. The code below specified the concurrency limitation to 4, which means at the most only 4 ajax calls could be invoked at the same time. 1: $("#upload_button_blob").click(function () { 2: // assert the browser support html5 3: ... ... 4: // start to upload each files in chunks 5: var files = $("#upload_files")[0].files; 6: for (var i = 0; i < files.length; i++) { 7: var file = files[i]; 8: var fileSize = file.size; 9: var fileName = file.name; 10: // calculate the start and end byte index for each blocks(chunks) 11: // with the index, file name and index list for future using 12: ... ... 13: // define the function array and push all chunk upload operation into this array 14: ... ... 15: // invoke the functions one by one 16: // then invoke the commit ajax call to put blocks into blob in azure storage 17: async.parallelLimit(putBlocks, 4, function (error, result) { 18: var data = { 19: name: fileName, 20: list: list 21: }; 22: $.post("/Home/Commit", data, function (result) { 23: if (!result.success) { 24: alert(result.error); 25: } 26: else { 27: alert("done!"); 28: } 29: }); 30: }); 31: } 32: });   Summary In this post we discussed how to upload files in chunks to the backend service and then upload them into Windows Azure Blob Storage in blocks. We focused on the frontend side and leverage three new feature introduced in HTML 5 which are - File.slice: Read part of the file by specifying the start and end byte index. - Blob: File-like interface which contains the part of the file content. - FormData: Temporary form element that we can pass the chunk alone with some metadata to the backend service. Then we discussed the performance consideration of chunk uploading. Sequence upload cannot provide maximized upload speed, but the unlimited parallel upload might crash the browser and server if too many chunks. So we finally came up with the solution to upload chunks in parallel with the concurrency limitation. We also demonstrated how to utilize “async.js” JavaScript library to help us control the asynchronize call and the parallel limitation.   Regarding the chunk size and the parallel limitation value there is no “best” value. You need to test vary composition and find out the best one for your particular scenario. It depends on the local bandwidth, client machine cores and the server side (Windows Azure Cloud Service Virtual Machine) cores, memory and bandwidth. Below is one of my performance test result. The client machine was Windows 8 IE 10 with 4 cores. I was using Microsoft Cooperation Network. The web site was hosted on Windows Azure China North data center (in Beijing) with one small web role (1.7GB 1 core CPU, 1.75GB memory with 100Mbps bandwidth). The test cases were - Chunk size: 512KB, 1MB, 2MB, 4MB. - Upload Mode: Sequence, parallel (unlimited), parallel with limit (4 threads, 8 threads). - Chunk Format: base64 string, binaries. - Target file: 100MB. - Each case was tested 3 times. Below is the test result chart. Some thoughts, but not guidance or best practice: - Parallel gets better performance than series. - No significant performance improvement between parallel 4 threads and 8 threads. - Transform with binaries provides better performance than base64. - In all cases, chunk size in 1MB - 2MB gets better performance.   Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

    Read the article

  • OWB 11gR2 - Early Arriving Facts

    - by Dawei Sun
    A common challenge when building ETL components for a data warehouse is how to handle early arriving facts. OWB 11gR2 introduced a new feature to address this for dimensional objects entitled Orphan Management. An orphan record is one that does not have a corresponding existing parent record. Orphan management automates the process of handling source rows that do not meet the requirements necessary to form a valid dimension or cube record. In this article, a simple example will be provided to show you how to use Orphan Management in OWB. We first import a sample MDL file that contains all the objects we need. Then we take some time to examine all the objects. After that, we prepare the source data, deploy the target table and dimension/cube loading map. Finally, we run the loading maps, and check the data in target dimension/cube tables. OK, let’s start… 1. Import MDL file and examine sample project First, download zip file from here, which includes a MDL file and three source data files. Then we open OWB design center, import orphan_management.mdl by using the menu File->Import->Warehouse Builder Metadata. Now we have several objects in BI_DEMO project as below: Mapping LOAD_CHANNELS_OM: The mapping for dimension loading. Mapping LOAD_SALES_OM: The mapping for cube loading. Dimension CHANNELS_OM: The dimension that contains channels data. Cube SALES_OM: The cube that contains sales data. Table CHANNELS_OM: The star implementation table of dimension CHANNELS_OM. Table SALES_OM: The star implementation table of cube SALES_OM. Table SRC_CHANNELS: The source table of channels data, that will be loaded into dimension CHANNELS_OM. Table SRC_ORDERS and SRC_ORDER_ITEMS: The source tables of sales data that will be loaded into cube SALES_OM. Sequence CLASS_OM_DIM_SEQ: The sequence used for loading dimension CHANNELS_OM. Dimension CHANNELS_OM This dimension has a hierarchy with three levels: TOTAL, CLASS and CHANNEL. Each level has three attributes: ID (surrogate key), NAME and SOURCE_ID (business key). It has a standard star implementation. The orphan management policy and the default parent setting are shown in the following screenshots: The orphan management policy options that you can set for loading are: Reject Orphan: The record is not inserted. Default Parent: You can specify a default parent record. This default record is used as the parent record for any record that does not have an existing parent record. If the default parent record does not exist, Warehouse Builder creates the default parent record. You specify the attribute values of the default parent record at the time of defining the dimensional object. If any ancestor of the default parent does not exist, Warehouse Builder also creates this record. No Maintenance: This is the default behavior. Warehouse Builder does not actively detect, reject, or fix orphan records. While removing data from a dimension, you can select one of the following orphan management policies: Reject Removal: Warehouse Builder does not allow you to delete the record if it has existing child records. No Maintenance: This is the default behavior. Warehouse Builder does not actively detect, reject, or fix orphan records. (More details are at http://download.oracle.com/docs/cd/E11882_01/owb.112/e10935/dim_objects.htm#insertedID1) Cube SALES_OM This cube is references to dimension CHANNELS_OM. It has three measures: AMOUNT, QUANTITY and COST. The orphan management policy setting are shown as following screenshot: The orphan management policy options that you can set for loading are: No Maintenance: Warehouse Builder does not actively detect, reject, or fix orphan rows. Default Dimension Record: Warehouse Builder assigns a default dimension record for any row that has an invalid or null dimension key value. Use the Settings button to define the default parent row. Reject Orphan: Warehouse Builder does not insert the row if it does not have an existing dimension record. (More details are at http://download.oracle.com/docs/cd/E11882_01/owb.112/e10935/dim_objects.htm#BABEACDG) Mapping LOAD_CHANNELS_OM This mapping loads source data from table SRC_CHANNELS to dimension CHANNELS_OM. The operator CHANNELS_IN is bound to table SRC_CHANNELS; CHANNELS_OUT is bound to dimension CHANNELS_OM. The TOTALS operator is used for generating a constant value for the top level in the dimension. The CLASS_FILTER operator is used to filter out the “invalid” class name, so then we can see what will happen when those channel records with an “invalid” parent are loading into dimension. Some properties of the dimension operator in this mapping are important to orphan management. See the screenshot below: Create Default Level Records: If YES, then default level records will be created. This property must be set to YES for dimensions and cubes if one of their orphan management policies is “Default Parent” or “Default Dimension Record”. This property is set to NO by default, so the user may need to set this to YES manually. LOAD policy for INVALID keys/ LOAD policy for NULL keys: These two properties have the same meaning as in the dimension editor. The values are set to the same as the dimension value when user drops the dimension into the mapping. The user does not need to modify these properties. Record Error Rows: If YES, error rows will be inserted into error table when loading the dimension. REMOVE Orphan Policy: This property is used when removing data from a dimension. Since the dimension loading type is set to LOAD in this example, this property is disabled. Mapping LOAD_SALES_OM This mapping loads source data from table SRC_ORDERS and SRC_ORDER_ITEMS to cube SALES_OM. This mapping seems a little bit complicated, but operators in the red rectangle are used to filter out and generate the records with “invalid” or “null” dimension keys. Some properties of the cube operator in a mapping are important to orphan management. See the screenshot below: Enable Source Aggregation: Should be checked in this example. If the default dimension record orphan policy is set for the cube operator, then it is recommended that source aggregation also be enabled. Otherwise, the orphan management processing may produce multiple fact rows with the same default dimension references, which will cause an “unstable rowset” execution error in the database, since the dimension refs are used as update match attributes for updating the fact table. LOAD policy for INVALID keys/ LOAD policy for NULL keys: These two properties have the same meaning as in the cube editor. The values are set to the same as in the cube editor when the user drops the cube into the mapping. The user does not need to modify these properties. Record Error Rows: If YES, error rows will be inserted into error table when loading the cube. 2. Deploy objects and mappings We now can deploy the objects. First, make sure location SALES_WH_LOCAL has been correctly configured. Then open Control Center Manager by using the menu Tools->Control Center Manager. Expand BI_DEMO->SALES_WH_LOCAL, click SALES_WH node on the project tree. We can see the following objects: Deploy all the objects in the following order: Sequence CLASS_OM_DIM_SEQ Table CHANNELS_OM, SALES_OM, SRC_CHANNELS, SRC_ORDERS, SRC_ORDER_ITEMS Dimension CHANNELS_OM Cube SALES_OM Mapping LOAD_CHANNELS_OM, LOAD_SALES_OM Note that we deployed source tables as well. Normally, we import source table from database instead of deploying them to target schema. However, in this example, we designed the source tables in OWB and deployed them to database for the purpose of this demonstration. 3. Prepare and examine source data Before running the mappings, we need to populate and examine the source data first. Run SRC_CHANNELS.sql, SRC_ORDERS.sql and SRC_ORDER_ITEMS.sql as target user. Then we check the data in these three tables. Table SRC_CHANNELS SQL> select rownum, id, class, name from src_channels; Records 1~5 are correct; they should be loaded into dimension without error. Records 6,7 and 8 have null parents; they should be loaded into dimension with a default parent value, and should be inserted into error table at the same time. Records 9, 10 and 11 have “invalid” parents; they should be rejected by dimension, and inserted into error table. Table SRC_ORDERS and SRC_ORDER_ITEMS SQL> select rownum, a.id, a.channel, b.amount, b.quantity, b.cost from src_orders a, src_order_items b where a.id = b.order_id; Record 178 has null dimension reference; it should be loaded into cube with a default dimension reference, and should be inserted into error table at the same time. Record 179 has “invalid” dimension reference; it should be rejected by cube, and inserted into error table. Other records should be aggregated and loaded into cube correctly. 4. Run the mappings and examine the target data In the Control Center Manager, expand BI_DEMO-> SALES_WH_LOCAL-> SALES_WH-> Mappings, right click on LOAD_CHANNELS_OM node, click Start. Use the same way to run mapping LOAD_SALES_OM. When they successfully finished, we can check the data in target tables. Table CHANNELS_OM SQL> select rownum, total_id, total_name, total_source_id, class_id,class_name, class_source_id, channel_id, channel_name,channel_source_id from channels_om order by abs(dimension_key); Records 1,2 and 3 are the default dimension records for the three levels. Records 8, 10 and 15 are the loaded records that originally have null parents. We see their parents name (class_name) is set to DEF_CLASS_NAME. Those records whose CHANNEL_NAME are Special_4, Special_5 and Special_6 are not loaded to this table because of the invalid parent. Error Table CHANNELS_OM_ERR SQL> select rownum, class_source_id, channel_id, channel_name,channel_source_id, err$$$_error_reason from channels_om_err order by channel_name; We can see all the record with null parent or invalid parent are inserted into this error table. Error reason is “Default parent used for record” for the first three records, and “No parent found for record” for the last three. Table SALES_OM SQL> select a.*, b.channel_name from sales_om a, channels_om b where a.channels=b.channel_id; We can see the order record with null channel_name has been loaded into target table with a default channel_name. The one with “invalid” channel_name are not loaded. Error Table SALES_OM_ERR SQL> select a.amount, a.cost, a.quantity, a.channels, b.channel_name, a.err$$$_error_reason from sales_om_err a, channels_om b where a.channels=b.channel_id(+); We can see the order records with null or invalid channel_name are inserted into error table. If the dimension reference column is null, the error reason is “Default dimension record used for fact”. If it is invalid, the error reason is “Dimension record not found for fact”. Summary In summary, this article illustrated the Orphan Management feature in OWB 11gR2. Automated orphan management policies improve ETL developer and administrator productivity by addressing an important cause of cube and dimension load failures, without requiring developers to explicitly build logic to handle these orphan rows.

    Read the article

  • quick look at: dm_db_index_physical_stats

    - by fatherjack
    A quick look at the key data from this dmv that can help a DBA keep databases performing well and systems online as the users need them. When the dynamic management views relating to index statistics became available in SQL Server 2005 there was much hype about how they can help a DBA keep their servers running in better health than ever before. This particular view gives an insight into the physical health of the indexes present in a database. Whether they are use or unused, complete or missing some columns is irrelevant, this is simply the physical stats of all indexes; disabled indexes are ignored however. In it’s simplest form this dmv can be executed as:   The results from executing this contain a record for every index in every database but some of the columns will be NULL. The first parameter is there so that you can specify which database you want to gather index details on, rather than scan every database. Simply specifying DB_ID() in place of the first NULL achieves this. In order to avoid the NULLS, or more accurately, in order to choose when to have the NULLS you need to specify a value for the last parameter. It takes one of 4 values – DEFAULT, ‘SAMPLED’, ‘LIMITED’ or ‘DETAILED’. If you execute the dmv with each of these values you can see some interesting details in the times taken to complete each step. DECLARE @Start DATETIME DECLARE @First DATETIME DECLARE @Second DATETIME DECLARE @Third DATETIME DECLARE @Finish DATETIME SET @Start = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, DEFAULT) AS ddips SET @First = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ddips SET @Second = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ddips SET @Third = GETDATE() SELECT * FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ddips SET @Finish = GETDATE() SELECT DATEDIFF(ms, @Start, @First) AS [DEFAULT] , DATEDIFF(ms, @First, @Second) AS [SAMPLED] , DATEDIFF(ms, @Second, @Third) AS [LIMITED] , DATEDIFF(ms, @Third, @Finish) AS [DETAILED] Running this code will give you 4 result sets; DEFAULT will have 12 columns full of data and then NULLS in the remainder. SAMPLED will have 21 columns full of data. LIMITED will have 12 columns of data and the NULLS in the remainder. DETAILED will have 21 columns full of data. So, from this we can deduce that the DEFAULT value (the same one that is also applied when you query the view using a NULL parameter) is the same as using LIMITED. Viewing the final result set has some details that are worth noting: Running queries against this view takes significantly longer when using the SAMPLED and DETAILED values in the last parameter. The duration of the query is directly related to the size of the database you are working in so be careful running this on big databases unless you have tried it on a test server first. Let’s look at the data we get back with the DEFAULT value first of all and then progress to the extra information later. We know that the first parameter that we supply has to be a database id and for the purposes of this blog we will be providing that value with the DB_ID function. We could just as easily put a fixed value in there or a function such as DB_ID (‘AnyDatabaseName’). The first columns we get back are database_id and object_id. These are pretty explanatory and we can wrap those in some code to make things a little easier to read: SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName] … FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips  gives us   SELECT DB_NAME([ddips].[database_id]) AS [DatabaseName] , OBJECT_NAME([ddips].[object_id]) AS [TableName], [i].[name] AS [IndexName] , ….. FROM [sys].[dm_db_index_physical_stats](DB_ID(), NULL, NULL, NULL, NULL) AS ddips INNER JOIN [sys].[indexes] AS i ON [ddips].[index_id] = [i].[index_id] AND [ddips].[object_id] = [i].[object_id]     These handily tie in with the next parameters in the query on the dmv. If you specify an object_id and an index_id in these then you get results limited to either the table or the specific index. Once again we can place a  function in here to make it easier to work with a specific table. eg. SELECT * FROM [sys].[dm_db_index_physical_stats] (DB_ID(), OBJECT_ID(‘AdventureWorks2008.Person.Address’) , 1, NULL, NULL) AS ddips   Note: Despite me showing that functions can be placed directly in the parameters for this dmv, best practice recommends that functions are not used directly in the function as it is possible that they will fail to return a valid object ID. To be certain of not passing invalid values to this function, and therefore setting an automated process off on the wrong path, declare variables for the OBJECT_IDs and once they have been validated, use them in the function: DECLARE @db_id SMALLINT; DECLARE @object_id INT; SET @db_id = DB_ID(N’AdventureWorks_2008′); SET @object_id = OBJECT_ID(N’AdventureWorks_2008.Person.Address’); IF @db_id IS NULL BEGINPRINT N’Invalid database’; ENDELSE IF @object_id IS NULL BEGINPRINT N’Invalid object’; ENDELSE BEGINSELECT * FROM sys.dm_db_index_physical_stats (@db_id, @object_id, NULL, NULL , ‘LIMITED’); END; GO In cases where the results of querying this dmv don’t have any effect on other processes (i.e. simply viewing the results in the SSMS results area)  then it will be noticed when the results are not consistent with the expected results and in the case of this blog this is the method I have used. So, now we can relate the values in these columns to something that we recognise in the database lets see what those other values in the dmv are all about. The next columns are: We’ll skip partition_number, index_type_desc, alloc_unit_type_desc, index_depth and index_level  as this is a quick look at the dmv and they are pretty self explanatory. The final columns revealed by querying this view in the DEFAULT mode are avg_fragmentation_in_percent. This is the amount that the index is logically fragmented. It will show NULL when the dmv is queried in SAMPLED mode. fragment_count. The number of pieces that the index is broken into. It will show NULL when the dmv is queried in SAMPLED mode. avg_fragment_size_in_pages. The average size, in pages, of a single fragment in the leaf level of the IN_ROW_DATA allocation unit. It will show NULL when the dmv is queried in SAMPLED mode. page_count. Total number of index or data pages in use. OK, so what does this give us? Well, there is an obvious correlation between fragment_count, page_count and avg_fragment_size-in_pages. We see that an index that takes up 27 pages and is in 3 fragments has an average fragment size of 9 pages (27/3=9). This means that for this index there are 3 separate places on the hard disk that SQL Server needs to locate and access to gather the data when it is requested by a DML query. If this index was bigger than 72KB then having it’s data in 3 pieces might not be too big an issue as each piece would have a significant piece of data to read and the speed of access would not be too poor. If the number of fragments increases then obviously the amount of data in each piece decreases and that means the amount of work for the disks to do in order to retrieve the data to satisfy the query increases and this would start to decrease performance. This information can be useful to keep in mind when considering the value in the avg_fragmentation_in_percent column. This is arrived at by an internal algorithm that gives a value to the logical fragmentation of the index taking into account the multiple files, type of allocation unit and the previously mentioned characteristics if index size (page_count) and fragment_count. Seeing an index with a high avg_fragmentation_in_percent value will be a call to action for a DBA that is investigating performance issues. It is possible that tables will have indexes that suffer from rapid increases in fragmentation as part of normal daily business and that regular defragmentation work will be needed to keep it in good order. In other cases indexes will rarely become fragmented and therefore not need rebuilding from one end of the year to another. Keeping this in mind DBAs need to use an ‘intelligent’ process that assesses key characteristics of an index and decides on the best, if any, defragmentation method to apply should be used. There is a simple example of this in the sample code found in the Books OnLine content for this dmv, in example D. There are also a couple of very popular solutions created by SQL Server MVPs Michelle Ufford and Ola Hallengren which I would wholly recommend that you review for much further detail on how to care for your SQL Server indexes. Right, let’s get back on track then. Querying the dmv with the fifth parameter value as ‘DETAILED’ takes longer because it goes through the index and refreshes all data from every level of the index. As this blog is only a quick look a we are going to skate right past ghost_record_count and version_ghost_record_count and discuss avg_page_space_used_in_percent, record_count, min_record_size_in_bytes, max_record_size_in_bytes and avg_record_size_in_bytes. We can see from the details below that there is a correlation between the columns marked. Column 1 (Page_Count) is the number of 8KB pages used by the index, column 2 is how full each page is (how much of the 8KB has actual data written on it), column 3 is how many records are recorded in the index and column 4 is the average size of each record. This approximates to: ((Col1*8) * 1024*(Col2/100))/Col3 = Col4*. avg_page_space_used_in_percent is an important column to review as this indicates how much of the disk that has been given over to the storage of the index actually has data on it. This value is affected by the value given for the FILL_FACTOR parameter when creating an index. avg_record_size_in_bytes is important as you can use it to get an idea of how many records are in each page and therefore in each fragment, thus reinforcing how important it is to keep fragmentation under control. min_record_size_in_bytes and max_record_size_in_bytes are exactly as their names set them out to be. A detail of the smallest and largest records in the index. Purely offered as a guide to the DBA to better understand the storage practices taking place. So, keeping an eye on avg_fragmentation_in_percent will ensure that your indexes are helping data access processes take place as efficiently as possible. Where fragmentation recurs frequently then potentially the DBA should consider; the fill_factor of the index in order to leave space at the leaf level so that new records can be inserted without causing fragmentation so rapidly. the columns used in the index should be analysed to avoid new records needing to be inserted in the middle of the index but rather always be added to the end. * – it’s approximate as there are many factors associated with things like the type of data and other database settings that affect this slightly.  Another great resource for working with SQL Server DMVs is Performance Tuning with SQL Server Dynamic Management Views by Louis Davidson and Tim Ford – a free ebook or paperback from Simple Talk. Disclaimer – Jonathan is a Friend of Red Gate and as such, whenever they are discussed, will have a generally positive disposition towards Red Gate tools. Other tools are often available and you should always try others before you come back and buy the Red Gate ones. All code in this blog is provided “as is” and no guarantee, warranty or accuracy is applicable or inferred, run the code on a test server and be sure to understand it before you run it on a server that means a lot to you or your manager.

    Read the article

  • WCF - The maximum nametable character count quota (16384) has been exceeded while reading XML data.

    - by Jankhana
    I'm having a WCF Service that uses wsHttpBinding. The server configuration is as follows : <bindings> <wsHttpBinding> <binding name="wsHttpBinding" maxBufferPoolSize="2147483647" maxReceivedMessageSize="2147483647"> <readerQuotas maxDepth="2147483647" maxStringContentLength="2147483647" maxArrayLength="2147483647" maxBytesPerRead="2147483647" maxNameTableCharCount="2147483647" /> <security mode="None"> <transport clientCredentialType="Windows" proxyCredentialType="None" realm="" /> <message clientCredentialType="Windows" negotiateServiceCredential="true" algorithmSuite="Default" establishSecurityContext="true" /> </security> </binding> </wsHttpBinding> </bindings> At the client side I'm including the Service reference of the WCF-Service. It works great if I have limited functions say 90 Operation Contract in my IService but if add one more OperationContract than I'm unable to Update the Service reference nor i'm able to add that service reference. In this article it's mentioned that by changing those config files(i.e devenv.exe.config, WcfTestClient.exe.config and SvcUtil.exe.config) it will work but even including those bindings in those config files still that error pops up saying There was an error downloading 'http://10.0.3.112/MyService/Service1.svc/mex'. The request failed with HTTP status 400: Bad Request. Metadata contains a reference that cannot be resolved: 'http://10.0.3.112/MyService/Service1.svc/mex'. There is an error in XML document (1, 89549). The maximum nametable character count quota (16384) has been exceeded while reading XML data. The nametable is a data structure used to store strings encountered during XML processing - long XML documents with non-repeating element names, attribute names and attribute values may trigger this quota. This quota may be increased by changing the MaxNameTableCharCount property on the XmlDictionaryReaderQuotas object used when creating the XML reader. Line 1, position 89549. If the service is defined in the current solution, try building the solution and adding the service reference again. Any idea how to solve this????

    Read the article

  • Why doesn't Data Driven Subscription in SSRS 2005 like my Stored Procedure?

    - by bert
    I'm trying to define a Data Driven Subscription for a report in SSRS 2005. In Step 3 of the set up you're asked for: " a command or query that returns a list of recipients and optionally returns fields used to vary delivery settings and report parameter values for each recipient" This I have written and it returns the data without a hitch. I press next and it rolls onto the next screen in the set up which has all the variables to set for the DDS and in each case it has an option to "Select Value From Database" I select this radio button and press the drop down. No fields are available to me. Now the only way I could vary the number of parameters returned by the SP was to have the SP write the SQL to an nvarchar variable and then at the end execute the variable as sql. I have tested this in the Management Studio and it returns the expected fields. I even named them after the fields in SSRS but the thing won't put the field names into the dropdowns. I've even taken the query body out of the Stored Proc, verified it in SSRS and then tried that. It doesn't work either. Can anyone shed any light into what I'm doing wrong?

    Read the article

  • A catalogue of Cassandra log messages: What is the correct interpretation?

    - by knorv
    The following is a complete catalogue of all log messages generated by Cassandra 0.6 when stress-testing a Cassandra installation over an extended period of time: AntiEntropyService: Sending AEService tree for (,) to: [] CassandraDaemon: Binding thrift service to localhost/N.N.N.N:N CassandraDaemon: Cassandra starting up... ColumnFamilyStore: has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='.../cassandra/commitlog/CommitLog-N.log', position=N) ColumnFamilyStore: Enqueuing flush of Memtable()@N CommitLog: Discarding obsolete commit log:CommitLogSegment(.../cassandra/commitlog/CommitLog-N.log) CommitLog: Log replay complete CommitLog: Replaying .../cassandra/commitlog/CommitLog-N.log, ... CommitLogSegment: Creating new commitlog segment .../cassandra/commitlog/CommitLog-N.log CompactionManager: Compacted to .../cassandra/data//-N-Data.db. N/N bytes for N keys. Time: Nms. CompactionManager: Compacting [org.apache.cassandra.io.SSTableReader(path='.../cassandra/data//-N-Data.db'), ...] DatabaseDescriptor: Auto DiskAccessMode determined to be mmap GCInspector: GC for ConcurrentMarkSweep: N ms, N reclaimed leaving N used; max is N GCInspector: GC for ParNew: N ms, N reclaimed leaving N used; max is N Memtable: Completed flushing .../cassandra/data//-N-Data.db Memtable: Writing Memtable()@N SSTable: Deleted .../cassandra/data//-N-Data.db SSTableDeletingReference: Deleted .../cassandra/data//-N-Data.db SSTableReader: Sampling index for .../cassandra/data//-N-Data.db StorageService: Starting up server gossip SystemTable: Saved ClusterName found: Test Cluster SystemTable: Saved ClusterName not found. Using Test Cluster SystemTable: Saved Token found: N SystemTable: Saved Token not found. Using N For each of the log messages listed - what is the correct interpretation of the log message?

    Read the article

  • iPhone: Memory leak when using NSOperationQueue...

    - by MacTouch
    Hi, I'm sitting here for at least half an hour to find a memory leak in my code. I just replaced an synchronous call to a (touch) method with an asynchronous one using NSOperationQueue. The Leak Inspector reports a memory leak after I did the change to the code. What's wrong with the version using NSOperationQueue? Version without a MemoryLeak -(NSData *)dataForKey:(NSString*)ressourceId_ { NSString *path = [self cachePathForKey:ressourceId_]; // returns an autoreleased NSString* NSString *cacheKey = [self cacheKeyForRessource:ressourceId_]; // returns an autoreleased NSString* NSData *data = [[self memoryCache] valueForKey:cacheKey]; if (!data) { data = [self loadData:path]; // returns an autoreleased NSData* if (data) { [[self memoryCache] setObject:data forKey:cacheKey]; } } [[self touch:path]; return data; } Version with a MemoryLeak (I do not see any) -(NSData *)dataForKey:(NSString*)ressourceId_ { NSString *path = [self cachePathForKey:ressourceId_]; // returns an autoreleased NSString* NSString *cacheKey = [self cacheKeyForRessource:ressourceId_]; // returns an autoreleased NSString* NSData *data = [[self memoryCache] valueForKey:cacheKey]; if (!data) { data = [self loadData:path]; // returns an autoreleased NSData* if (data) { [[self memoryCache] setObject:data forKey:cacheKey]; } } NSInvocationOperation *touchOp = [[NSInvocationOperation alloc] initWithTarget:self selector:@selector(touch:) object:path]; [[self operationQueue] addOperation:touchOp]; [touchOp release]; return data; } And of course, the touch method does nothing special too. Just change the date of the file. -(void)touch:(id)path_ { NSString *path = (NSString *)path_ NSFileManager *fm = [NSFileManager defaultManager]; if ([fm fileExistsAtPath:path]) { NSDictionary *attributes = [NSDictionary dictionaryWithObjectsAndKeys:[NSDate date], NSFileModificationDate, nil]; [fm setAttributes: attributes ofItemAtPath:path error:nil]; } }

    Read the article

  • File Format Conversion to TIFF. Some issues???

    - by Nains
    I'm having a proprietary image format SNG( a proprietary format) which is having a countinous array of Image data along with Image meta information in seperate HDR file. Now I need to convert this SNG format to a Standard TIFF 6.0 Format. So I studied the TIFF format i.e. about its Header, Image File Directories( IFD's) and Stripped Image Data. Now I have few concerns about this conversion. Please assist me. SNG Continous Data vs TIFF Stripped Data: Should I convert SNG Data to TIFF as a continous data in one Strip( data load/edit time problem?) OR make logical StripOffsets of the SNG Image data. SNG Data Header uses only necessary Meta Information, thus while converting the SNG to TIFF, some information can’t be retrieved such as NewSubFileType, Software Tag etc. So this raises a concern that after conversion whether any missing directory information such as NewSubFileType, Software Tag etc is necessary and sufficient condition for TIFF File. Encoding of each pixel component of RGB Sample in SNG data: Here each SNG Image Data Strip per Pixel component is encoded as: Out^[i] := round( LineBuffer^[i * 3] * **0.072169** + LineBuffer^[i * 3 + 1] * **0.715160** + LineBuffer^[i * 3+ 2]* **0.212671**); Only way I deduce from it is that each Pixel is represented with 3 RGB component and some coefficient is multiplied with each component to make the SNG Viewer work RGB color information of SNG Image Data. (Developer who earlier work on this left, now i am following the trace :)) Thus while converting this to TIFF, the decoding the same needs to be done. This raises a concern that the how RBG information in TIFF is produced, or better do we need this information?. Please assist...

    Read the article

  • jQuery autocomplete: taking JSON input and setting multiple fields from single field

    - by Sly
    I am trying to get the jQuery autocomplete plugin to take a local JSON variable as input. Once the user has selected one option from the autocomplete list, I want the adjacent address fields to be autopopulated. Here's the JSON variable that declared as a global variable in the of the HTML file: var JSON_address={"1":{"origin":{"nametag":"Home","street":"Easy St","city":"Emerald City","state":"CA","zip":"9xxxx"},"destination":{"nametag":"Work","street":"Factory St","city":"San Francisco","state":"CA","zip":"94104"}},"2":{"origin":{"nametag":"Work","street":"Umpa Loompa St","city":"San Francisco","state":"CA","zip":"94104"},"destination":{"nametag":"Home","street":"Easy St","city":"Emerald City ","state":"CA","zip":"9xxxx"}}}</script> I want the first field to display a list of "origin" nametags: "Home", "Work". Then when "Home" is selected, adjacent fields are automatically populated with Street: Easy St, City: Emerald City, etc. Here's the code I have for the autocomplete: $(document).ready(function(){ $("#origin_nametag_id").autocomplete(JSON_address, { autoFill:true, minChars:0, dataType: 'json', parse: function(data) { var rows = new Array(); for (var i=0; i<=data.length; i++) { rows[rows.length] = { data:data[i], value:data[i].origin.nametag, result:data[i].origin.nametag }; } return rows; } }).change(function(){ $("#street_address_id").autocomplete({ dataType: 'json', parse: function(data) { var rows = new Array(); for (var i=0; i<=data.length; i++) { rows[rows.length] = { data:data[i], value:data[i].origin.street, result:data[i].origin.street }; } return rows; } }); }); }); So this question really has two subparts: 1) How do you get autocomplete to process the multi-dimensional JSON object? (When the field is clicked and text entered, nothing happens - no list) 2) How do you get the other fields (street, city, etc) to populate based upon the origin nametag sub-array? Thanks in advance!

    Read the article

  • Is this a good approach to address double-base64-encoding?

    - by Freiheit
    My software understands attachments, like PNGs attached to user records. These attachments are usually sent in from outside sources as a Base64 encoded string. The database stores whatever data it is given, Base64 encoded or not. When I serve up the attachment for download I do this: if (Base64.isBase64(data)) { data = Base64.decodeBase64(data); } There is a potential for data that is double encoded. For instance the sender of a message had base64 encoded data, then encoded it again when building the message to send to me. I think the following code would address that circumstance: while (Base64.isBase64(data)) { data = Base64.decodeBase64(data); } So if data is encoded multiple times, it would be decoded until its in its 'raw' state and then served up for download. Is this approach an acceptable way to address that problem? Ideally some sort of checking could happen at the edge when I receive attachment data, but that will take more time. This looping seems to be a faster way to do it. The 'Base64' library is Apache Commons: http://commons.apache.org/codec/apidocs/org/apache/commons/codec/binary/Base64.html I trust it to properly identify Base64 encoded data.

    Read the article

  • How can a data ellipse be superimposed on a ggplot2 scatterplot?

    - by Radu
    Hi, I have an R function which produces 95% confidence ellipses for scatterplots. The output looks like this, having a default of 50 points for each ellipse (50 rows): [,1] [,2] [1,] 0.097733810 0.044957994 [2,] 0.084433494 0.050337990 [3,] 0.069746783 0.054891438 I would like to superimpose a number of such ellipses for each level of a factor called 'site' on a ggplot2 scatterplot, produced from this command: > plat1 <- ggplot(mapping=aes(shape=site, size=geom), shape=factor(site)); plat1 + geom_point(aes(x=PC1.1,y=PC2.1)) This is run on a dataset, called dflat which looks like this: site geom PC1.1 PC2.1 PC3.1 PC1.2 PC2.2 1 Buhlen 1259.5649 -0.0387975838 -0.022889782 0.01355317 0.008705276 0.02441577 2 Buhlen 653.6607 -0.0009398704 -0.013076251 0.02898955 -0.001345149 0.03133990 The result is fine, but when I try to add the ellipse (let's say for this one site, called "Buhlen"): > plat1 + geom_point(aes(x=PC1.1,y=PC2.1)) + geom_path(data=subset(dflat, site="Buhlen"),mapping=aes(x=ELLI(PC1.1,PC2.1)[,1],y=ELLI(PC1.1,PC2.1)[,2])) I get an error message: "Error in data.frame(x = c(0.0977338099339815, 0.0844334944904515, 0.0697467834016782, : arguments imply differing number of rows: 50, 211 I've managed to fix this in the past, but I cannot remember how. It seems that geom_path is relying on the same points rather than plotting new ones. Any help would be appreciated.

    Read the article

  • Can't change pivot table's Access data source - bug in Excel 2000 SP3?

    - by Ron West
    I have a set of Excel 2000 SP3 worksheets that have Pivot Tables that get data from an Access 2000 SP3 database created by a contractor who left our company. Unfortunately, he did all his work on his private area on the company (Novell) network and now that he has left us, the drive spec has been deleted and is invalid. We were able to get the database files restored to our network area by our IT Service Desk people, but we now have to re-link everything to point to our group area instead of the now-nonexistent private area. If I follow the advice given elsewhere on this site (open wizard, click 'Back' to get to 'Step 2 of 3', click 'Get Data...' I get a message that the old filespec is an invalid path and I need to check that the path name is invalid and that I am connected to the server on which the file resides. I then click on OK and get a Login dialog with a 'Database...' button on the right. I click this and get a 'Select Database' dialog which allows me to choose the appropriate database in its correct new location. I then click OK, which takes me back to the 'Login' screen. I can confirm that it has accepted my new location by clicking on 'Database...' as before and the NEW location is still shown. So far so good - but if I then click on OK I get two unhelpful messages - first I get one saying that Excel 'Could not use '|'; file already in use.' - although no other files are in use. Clicking on OK takes me back to the 'Login' dialog. Clicking OK again gives me the same message as before telling me that the OLD filespec is invalid (as if I hadn't changed anything) - but clicking on the 'Database...' button shows that the correct (NEW) database location is still selected. Can anyone tell me a way of using VBA to change the link information without having to spend hours fighting the PivotTable Wizard - preferably similar to this way you update an Access Tabledef:- db.TableDefs(strLinkName).Connect = strNewLink db.TableDefs(strLinkName).RefreshLink Thanks!

    Read the article

< Previous Page | 462 463 464 465 466 467 468 469 470 471 472 473  | Next Page >