Search Results

Search found 67143 results on 2686 pages for 'complex data types'.

Page 30/2686 | < Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37  | Next Page >

  • SQL SERVER – Guest Posts – Feodor Georgiev – The Context of Our Database Environment – Going Beyond the Internal SQL Server Waits – Wait Type – Day 21 of 28

    - by pinaldave
    This guest post is submitted by Feodor. Feodor Georgiev is a SQL Server database specialist with extensive experience of thinking both within and outside the box. He has wide experience of different systems and solutions in the fields of architecture, scalability, performance, etc. Feodor has experience with SQL Server 2000 and later versions, and is certified in SQL Server 2008. In this article Feodor explains the server-client-server process, and concentrated on the mutual waits between client and SQL Server. This is essential in grasping the concept of waits in a ‘global’ application plan. Recently I was asked to write a blog post about the wait statistics in SQL Server and since I had been thinking about writing it for quite some time now, here it is. It is a wide-spread idea that the wait statistics in SQL Server will tell you everything about your performance. Well, almost. Or should I say – barely. The reason for this is that SQL Server is always a part of a bigger system – there are always other players in the game: whether it is a client application, web service, any other kind of data import/export process and so on. In short, the SQL Server surroundings look like this: This means that SQL Server, aside from its internal waits, also depends on external waits and settings. As we can see in the picture above, SQL Server needs to have an interface in order to communicate with the surrounding clients over the network. For this communication, SQL Server uses protocol interfaces. I will not go into detail about which protocols are best, but you can read this article. Also, review the information about the TDS (Tabular data stream). As we all know, our system is only as fast as its slowest component. This means that when we look at our environment as a whole, the SQL Server might be a victim of external pressure, no matter how well we have tuned our database server performance. Let’s dive into an example: let’s say that we have a web server, hosting a web application which is using data from our SQL Server, hosted on another server. The network card of the web server for some reason is malfunctioning (think of a hardware failure, driver failure, or just improper setup) and does not send/receive data faster than 10Mbs. On the other end, our SQL Server will not be able to send/receive data at a faster rate either. This means that the application users will notify the support team and will say: “My data is coming very slow.” Now, let’s move on to a bit more exciting example: imagine that there is a similar setup as the example above – one web server and one database server, and the application is not using any stored procedure calls, but instead for every user request the application is sending 80kb query over the network to the SQL Server. (I really thought this does not happen in real life until I saw it one day.) So, what happens in this case? To make things worse, let’s say that the 80kb query text is submitted from the application to the SQL Server at least 100 times per minute, and as often as 300 times per minute in peak times. Here is what happens: in order for this query to reach the SQL Server, it will have to be broken into a of number network packets (according to the packet size settings) – and will travel over the network. On the other side, our SQL Server network card will receive the packets, will pass them to our network layer, the packets will get assembled, and eventually SQL Server will start processing the query – parsing, allegorizing, generating the query execution plan and so on. So far, we have already had a serious network overhead by waiting for the packets to reach our Database Engine. There will certainly be some processing overhead – until the database engine deals with the 80kb query and its 20 subqueries. The waits you see in the DMVs are actually collected from the point the query reaches the SQL Server and the packets are assembled. Let’s say that our query is processed and it finally returns 15000 rows. These rows have a certain size as well, depending on the data types returned. This means that the data will have converted to packages (depending on the network size package settings) and will have to reach the application server. There will also be waits, however, this time you will be able to see a wait type in the DMVs called ASYNC_NETWORK_IO. What this wait type indicates is that the client is not consuming the data fast enough and the network buffers are filling up. Recently Pinal Dave posted a blog on Client Statistics. What Client Statistics does is captures the physical flow characteristics of the query between the client(Management Studio, in this case) and the server and back to the client. As you see in the image, there are three categories: Query Profile Statistics, Network Statistics and Time Statistics. Number of server roundtrips–a roundtrip consists of a request sent to the server and a reply from the server to the client. For example, if your query has three select statements, and they are separated by ‘GO’ command, then there will be three different roundtrips. TDS Packets sent from the client – TDS (tabular data stream) is the language which SQL Server speaks, and in order for applications to communicate with SQL Server, they need to pack the requests in TDS packets. TDS Packets sent from the client is the number of packets sent from the client; in case the request is large, then it may need more buffers, and eventually might even need more server roundtrips. TDS packets received from server –is the TDS packets sent by the server to the client during the query execution. Bytes sent from client – is the volume of the data set to our SQL Server, measured in bytes; i.e. how big of a query we have sent to the SQL Server. This is why it is best to use stored procedures, since the reusable code (which already exists as an object in the SQL Server) will only be called as a name of procedure + parameters, and this will minimize the network pressure. Bytes received from server – is the amount of data the SQL Server has sent to the client, measured in bytes. Depending on the number of rows and the datatypes involved, this number will vary. But still, think about the network load when you request data from SQL Server. Client processing time – is the amount of time spent in milliseconds between the first received response packet and the last received response packet by the client. Wait time on server replies – is the time in milliseconds between the last request packet which left the client and the first response packet which came back from the server to the client. Total execution time – is the sum of client processing time and wait time on server replies (the SQL Server internal processing time) Here is an illustration of the Client-server communication model which should help you understand the mutual waits in a client-server environment. Keep in mind that a query with a large ‘wait time on server replies’ means the server took a long time to produce the very first row. This is usual on queries that have operators that need the entire sub-query to evaluate before they proceed (for example, sort and top operators). However, a query with a very short ‘wait time on server replies’ means that the query was able to return the first row fast. However a long ‘client processing time’ does not necessarily imply the client spent a lot of time processing and the server was blocked waiting on the client. It can simply mean that the server continued to return rows from the result and this is how long it took until the very last row was returned. The bottom line is that developers and DBAs should work together and think carefully of the resource utilization in the client-server environment. From experience I can say that so far I have seen only cases when the application developers and the Database developers are on their own and do not ask questions about the other party’s world. I would recommend using the Client Statistics tool during new development to track the performance of the queries, and also to find a synchronous way of utilizing resources between the client – server – client. Here is another example: think about similar setup as above, but add another server to the game. Let’s say that we keep our media on a separate server, and together with the data from our SQL Server we need to display some images on the webpage requested by our user. No matter how simple or complicated the logic to get the images is, if the images are 500kb each our users will get the page slowly and they will still think that there is something wrong with our data. Anyway, I don’t mean to get carried away too far from SQL Server. Instead, what I would like to say is that DBAs should also be aware of ‘the big picture’. I wrote a blog post a while back on this topic, and if you are interested, you can read it here about the big picture. And finally, here are some guidelines for monitoring the network performance and improving it: Run a trace and outline all queries that return more than 1000 rows (in Profiler you can actually filter and sort the captured trace by number of returned rows). This is not a set number; it is more of a guideline. The general thought is that no application user can consume that many rows at once. Ask yourself and your fellow-developers: ‘why?’. Monitor your network counters in Perfmon: Network Interface:Output queue length, Redirector:Network errors/sec, TCPv4: Segments retransmitted/sec and so on. Make sure to establish a good friendship with your network administrator (buy them coffee, for example J ) and get into a conversation about the network settings. Have them explain to you how the network cards are setup – are they standalone, are they ‘teamed’, what are the settings – full duplex and so on. Find some time to read a bit about networking. In this short blog post I hope I have turned your attention to ‘the big picture’ and the fact that there are other factors affecting our SQL Server, aside from its internal workings. As a further reading I would still highly recommend the Wait Stats series on this blog, also I would recommend you have the coffee break conversation with your network admin as soon as possible. This guest post is written by Feodor Georgiev. Read all the post in the Wait Types and Queue series. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL

    Read the article

  • Oracle SQL Developer Data Modeler: What Tables Aren’t In At Least One SubView?

    - by thatjeffsmith
    Organizing your data model makes the information easier to consume. One of the organizational tools provided by Oracle SQL Developer Data Modeler is the ‘SubView.’ In a nutshell, a SubView is a subset of your model. The Challenge: I’ve just created a model which represents my entire ____________ application. We’ll call it ‘residential lending.’ Instead of having all 100+ tables in a single model diagram, I want to break out the tables by module, e.g. appraisals, credit reports, work histories, customers, etc. I’ve spent several hours breaking out the tables to one or more SubViews, but I think i may have missed a few. Is there an easy way to see what tables aren’t in at least ONE subview? The Answer Yes, mostly. The mostly comes about from the way I’m going to accomplish this task. It involves querying the SQL Developer Data Modeler Reporting Schema. So if you don’t have the Reporting Schema setup, you’ll need to do so. Got it? Good, let’s proceed. Before you start querying your Reporting Schema, you might need a data model for the actual reporting schema…meta-meta data! You could reverse engineer the data modeler reporting schema to a new data model, or you could just reference the PDFs in \datamodeler\reports\Reporting Schema diagrams directory. Here’s a hint, it’s THIS one The Query Well, it’s actually going to be at least 2 queries. We need to get a list of distinct designs stored in your repository. For giggles, I’m going to get a listing including each version of the model. So I can query based on design and version, or in this case, timestamp of when it was added to the repository. We’ll get that from the DMRS_DESIGNS table: SELECT DISTINCT design_name, design_ovid, date_published FROM DMRS_designs Then I’m going to feed the design_ovid, down to a subquery for my child report. select name, count(distinct diagram_id) from DMRS_DIAGRAM_ELEMENTS where design_ovid = :dESIGN_OVID and type = 'Table' group by name having count(distinct diagram_id) < 2 order by count(distinct diagram_id) desc Each diagram element has an entry in this table, so I need to filter on type=’Table.’ Each design has AT LEAST one diagram, the master diagram. So any relational table in this table, only having one listing means it’s not in any SubViews. If you have overloaded object names, which is VERY possible, you’ll want to do the report off of ‘OBJECT_ID’, but then you’ll need to correlate that to the NAME, as I doubt you’re so intimate with your designs that you recognize the GUIDs So I’m going to cheat and just stick with names, but I think you get the gist. My Model Of my almost 90 tables, how many of those have I not added to at least one SubView? Now let’s run my report! Voila! My ‘BEER2′ table isn’t in any SubView! It says ’1′ because the main model diagram counts as a view. So if the count came back as ’2′, that would mean the table was in the main model diagram and in 1 SubView diagram. And I know what you’re thinking, what kind of residential lending program would have a table called ‘BEER2?’ Let’s just say, that my business model has some kinks to work out!

    Read the article

  • Different types of Session state management options available with ASP.NET

    - by Aamir Hasan
    ASP.NET provides In-Process and Out-of-Process state management.In-Process stores the session in memory on the web server.This requires the a "sticky-server" (or no load-balancing) so that the user is always reconnected to the same web server.Out-of-Process Session state management stores data in an external data source.The external data source may be either a SQL Server or a State Server service.Out-of-Process state management requires that all objects stored in session are serializable.Linkhttp://msdn.microsoft.com/en-us/library/ms178586%28VS.80%29.aspx

    Read the article

  • Refreshing imported MySQL data with MySQL for Excel

    - by Javier Rivera
    Welcome to another blog post from the MySQL for Excel Team. Today we're going to talk about a new feature included since MySQL for Excel 1.3.0, you can install the latest GA or maintenance version using the MySQL Installer or optionally you can download directly any GA or non-GA version from the MySQL Developer Zone.As some users suggested in our forums we should be maintaining the link between tables and Excel not only when editing data through the Edit MySQL Data option, but also when importing data via Import MySQL Data. Before 1.3.0 this process only provided you with an offline copy of the Table's data into Excel and you had no way to refresh that information from the DB later on. Now, with this new feature we'll show you how easy is to work with the latest available information at all times. This feature is transparent to you (it doesn't require additional steps to work as long as the users had the Create an Excel Table for the imported MySQL table data option enabled. To ensure you have this option checked, click over Advanced Options... after the Import Data dialog is displayed). The current blog post assumes you already know how to import data into excel, you could always take a look at our previous post How To - Guide to Importing Data from a MySQL Database to Excel using MySQL for Excel if you need further reference on that topic. After importing Data from a MySQL Table into Excel, you can refresh the data in 3 ways.1. Simply right click over the range of the imported data, to show the pop-up menu: Click over the Refresh button to obtain the latest copy of the data in the table. 2. Click the Refresh button on the Data ribbon: 3. Click the Refresh All button in the Data ribbon (beware this will refresh all Excel tables in the Workbook): Please take a note of a couple of details here, the first one is about the size of the table. If by the time you refresh the table new columns had been added to it, and you originally have imported all columns, the table will grow to the right. The same applies to rows, if the table has new rows and you did not limit the results , the table will grow to to the bottom of the sheet in Excel. The second detail you should take into account is this operation will overwrite any changes done to the cells after the table was originally imported or previously refreshed: Now with this new feature, imported data remains linked to the data source and is available to be updated at all times. It empowers the user to always be able to work with the latest version of the imported MySQL data. We hope you like this this new feature and give it a try! Remember that your feedback is very important for us, so drop us a message with your comments, suggestions for this or other features and follow us at our social media channels: MySQL on Windows (this) Blog: https://blogs.oracle.com/MySqlOnWindows/ MySQL for Excel forum: http://forums.mysql.com/list.php?172 Facebook: http://www.facebook.com/mysql YouTube channel: https://www.youtube.com/user/MySQLChannel Thanks!

    Read the article

  • Analyzing data from same tables in diferent db instances.

    - by Oscar Reyes
    Short version: How can I map two columns from table A and B if they both have a common identifier which in turn may have two values in column C Lets say: A --- 1 , 2 B --- ? , 3 C ----- 45, 2 45, 3 Using table C I know that id 2 and 3 belong to the same item ( 45 ) and thus "?" in table B should be 1. What query could do something like that? EDIT Long version ommited. It was really boring/confusing EDIT I'm posting some output here. From this query: select distinct( rolein) , activityin from taskperformance@dm_prod where activityin in ( select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) ) I have the following parts: select distinct( activityin ) from taskperformance where rolein = 0 Output: http://question1337216.pastebin.com/f5039557 select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) Output: http://question1337216.pastebin.com/f6cef9393 And finally: select distinct( rolein) , activityin from taskperformance@dm_prod where activityin in ( select activityin from activities@dm_prod where activityid in ( select activityid from activities@dm_prod where activityin in ( select distinct( activityin ) from taskperformance where rolein = 0 ) ) ) Output: http://question1337216.pastebin.com/f346057bd Take for instace activityin 335 from first query ( from taskperformance B) . It is present in actvities from A. But is not in taskperformace in A ( but a the related activities: 92, 208, 335, 595 ) Are present in the result. The corresponding role in is: 1

    Read the article

  • Extract wrong data from a frame in C?

    - by ipkiss
    I am writing a program that reads the data from the serial port on Linux. The data are sent by another device with the following frame format: |start | Command | Data | CRC | End | |0x02 | 0x41 | (0-127 octets) | | 0x03| ---------------------------------------------------- The Data field contains 127 octets as shown and octet 1,2 contains one type of data; octet 3,4 contains another data. I need to get these data. Because in C, one byte can only holds one character and in the start field of the frame, it is 0x02 which means STX which is 3 characters. So, in order to test my program, On the sender side, I construct an array as the frame formatted above like: char frame[254]; frame[0] = 0x02; // starting field frame[1] = 0x41; // command field which is character 'A' ..so on.. And, then On the receiver side, I take out the fields like: char result[254]; // read data read(result); printf("command = %c", result[1]); // get the command field of the frame // get other field's values the command field value (result[1]) is not character 'A'. I think, this because the first field value of the frame is 0x02 (STX) occupying 3 first places in the array frame and leading to the wrong results on the receiver side. How can I correct the issue or am I doing something wrong at the sender side? Thanks all. related questions: http://stackoverflow.com/questions/2500567/parse-and-read-data-frame-in-c http://stackoverflow.com/questions/2531779/clear-data-at-serial-port-in-linux-in-c

    Read the article

  • How to handle value types when embedding IronPython in C#?

    - by kloffy
    There is a well known issue when it comes to using .NET value types in IronPython. This has recently caused me a headache when trying to use Python as an embedded scripting language in C#. The problem can be summed up as follows: Given a C# struct such as: struct Vector { public float x; public float y; } And a C# class such as: class Object { public Vector position; } The following will happen in IronPython: obj = Object() print obj.position.x # prints ‘0’ obj.position.x = 1 print obj.position.x # still prints ‘0’ As the article states, this means that value types are mostly immutable. However, this is a problem as I was planning on using a vector library that is implemented as seen above. Are there any workarounds for working with existing libraries that rely on value types? Modifying the library would be the very last resort, but I'd rather avoid that.

    Read the article

  • Fraud Detection with the SQL Server Suite Part 1

    - by Dejan Sarka
    While working on different fraud detection projects, I developed my own approach to the solution for this problem. In my PASS Summit 2013 session I am introducing this approach. I also wrote a whitepaper on the same topic, which was generously reviewed by my friend Matija Lah. In order to spread this knowledge faster, I am starting a series of blog posts which will at the end make the whole whitepaper. Abstract With the massive usage of credit cards and web applications for banking and payment processing, the number of fraudulent transactions is growing rapidly and on a global scale. Several fraud detection algorithms are available within a variety of different products. In this paper, we focus on using the Microsoft SQL Server suite for this purpose. In addition, we will explain our original approach to solving the problem by introducing a continuous learning procedure. Our preferred type of service is mentoring; it allows us to perform the work and consulting together with transferring the knowledge onto the customer, thus making it possible for a customer to continue to learn independently. This paper is based on practical experience with different projects covering online banking and credit card usage. Introduction A fraud is a criminal or deceptive activity with the intention of achieving financial or some other gain. Fraud can appear in multiple business areas. You can find a detailed overview of the business domains where fraud can take place in Sahin Y., & Duman E. (2011), Detecting Credit Card Fraud by Decision Trees and Support Vector Machines, Proceedings of the International MultiConference of Engineers and Computer Scientists 2011 Vol 1. Hong Kong: IMECS. Dealing with frauds includes fraud prevention and fraud detection. Fraud prevention is a proactive mechanism, which tries to disable frauds by using previous knowledge. Fraud detection is a reactive mechanism with the goal of detecting suspicious behavior when a fraudster surpasses the fraud prevention mechanism. A fraud detection mechanism checks every transaction and assigns a weight in terms of probability between 0 and 1 that represents a score for evaluating whether a transaction is fraudulent or not. A fraud detection mechanism cannot detect frauds with a probability of 100%; therefore, manual transaction checking must also be available. With fraud detection, this manual part can focus on the most suspicious transactions. This way, an unchanged number of supervisors can detect significantly more frauds than could be achieved with traditional methods of selecting which transactions to check, for example with random sampling. There are two principal data mining techniques available both in general data mining as well as in specific fraud detection techniques: supervised or directed and unsupervised or undirected. Supervised techniques or data mining models use previous knowledge. Typically, existing transactions are marked with a flag denoting whether a particular transaction is fraudulent or not. Customers at some point in time do report frauds, and the transactional system should be capable of accepting such a flag. Supervised data mining algorithms try to explain the value of this flag by using different input variables. When the patterns and rules that lead to frauds are learned through the model training process, they can be used for prediction of the fraud flag on new incoming transactions. Unsupervised techniques analyze data without prior knowledge, without the fraud flag; they try to find transactions which do not resemble other transactions, i.e. outliers. In both cases, there should be more frauds in the data set selected for checking by using the data mining knowledge compared to selecting the data set with simpler methods; this is known as the lift of a model. Typically, we compare the lift with random sampling. The supervised methods typically give a much better lift than the unsupervised ones. However, we must use the unsupervised ones when we do not have any previous knowledge. Furthermore, unsupervised methods are useful for controlling whether the supervised models are still efficient. Accuracy of the predictions drops over time. Patterns of credit card usage, for example, change over time. In addition, fraudsters continuously learn as well. Therefore, it is important to check the efficiency of the predictive models with the undirected ones. When the difference between the lift of the supervised models and the lift of the unsupervised models drops, it is time to refine the supervised models. However, the unsupervised models can become obsolete as well. It is also important to measure the overall efficiency of both, supervised and unsupervised models, over time. We can compare the number of predicted frauds with the total number of frauds that include predicted and reported occurrences. For measuring behavior across time, specific analytical databases called data warehouses (DW) and on-line analytical processing (OLAP) systems can be employed. By controlling the supervised models with unsupervised ones and by using an OLAP system or DW reports to control both, a continuous learning infrastructure can be established. There are many difficulties in developing a fraud detection system. As has already been mentioned, fraudsters continuously learn, and the patterns change. The exchange of experiences and ideas can be very limited due to privacy concerns. In addition, both data sets and results might be censored, as the companies generally do not want to publically expose actual fraudulent behaviors. Therefore it can be quite difficult if not impossible to cross-evaluate the models using data from different companies and different business areas. This fact stresses the importance of continuous learning even more. Finally, the number of frauds in the total number of transactions is small, typically much less than 1% of transactions is fraudulent. Some predictive data mining algorithms do not give good results when the target state is represented with a very low frequency. Data preparation techniques like oversampling and undersampling can help overcome the shortcomings of many algorithms. SQL Server suite includes all of the software required to create, deploy any maintain a fraud detection infrastructure. The Database Engine is the relational database management system (RDBMS), which supports all activity needed for data preparation and for data warehouses. SQL Server Analysis Services (SSAS) supports OLAP and data mining (in version 2012, you need to install SSAS in multidimensional and data mining mode; this was the only mode in previous versions of SSAS, while SSAS 2012 also supports the tabular mode, which does not include data mining). Additional products from the suite can be useful as well. SQL Server Integration Services (SSIS) is a tool for developing extract transform–load (ETL) applications. SSIS is typically used for loading a DW, and in addition, it can use SSAS data mining models for building intelligent data flows. SQL Server Reporting Services (SSRS) is useful for presenting the results in a variety of reports. Data Quality Services (DQS) mitigate the occasional data cleansing process by maintaining a knowledge base. Master Data Services is an application that helps companies maintaining a central, authoritative source of their master data, i.e. the most important data to any organization. For an overview of the SQL Server business intelligence (BI) part of the suite that includes Database Engine, SSAS and SSRS, please refer to Veerman E., Lachev T., & Sarka D. (2009). MCTS Self-Paced Training Kit (Exam 70-448): Microsoft® SQL Server® 2008 Business Intelligence Development and Maintenance. MS Press. For an overview of the enterprise information management (EIM) part that includes SSIS, DQS and MDS, please refer to Sarka D., Lah M., & Jerkic G. (2012). Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft® SQL Server® 2012. O'Reilly. For details about SSAS data mining, please refer to MacLennan J., Tang Z., & Crivat B. (2009). Data Mining with Microsoft SQL Server 2008. Wiley. SQL Server Data Mining Add-ins for Office, a free download for Office versions 2007, 2010 and 2013, bring the power of data mining to Excel, enabling advanced analytics in Excel. Together with PowerPivot for Excel, which is also freely downloadable and can be used in Excel 2010, is already included in Excel 2013. It brings OLAP functionalities directly into Excel, making it possible for an advanced analyst to build a complete learning infrastructure using a familiar tool. This way, many more people, including employees in subsidiaries, can contribute to the learning process by examining local transactions and quickly identifying new patterns.

    Read the article

  • Oracle Data Integrator at Oracle OpenWorld 2012: Demonstrations

    - by Irem Radzik
    By Mike Eisterer Oracle OpenWorld is just a few days away and  we look forward to showing Oracle Data Integrator' comprehensive data integration platform, which delivers critical data integration requirements: from high-volume, high-performance batch loads, to event-driven, trickle-feed integration processes, to SOA-enabled data services.  Several Oracle Data Integrator demonstrations will be available October 1st through the3rd : Oracle Data Integrator and Oracle GoldenGate for Oracle Applications, in Moscone South, Right - S-240 Oracle Data Integrator and Service Integration, in Moscone South, Right - S-235 Oracle Data Integrator for Big Data, in Moscone South, Right - S-236 Oracle Data Integrator for Enterprise Data Warehousing, in Moscone South, Right - S-238 Additional information about OOW 2012 may be found for the following demonstrations. If you are not able to attend OpenWorld, please check out our latest resources for Data Integration.  

    Read the article

  • Which algorithms/data structures should I "recognize" and know by name?

    - by Earlz
    I'd like to consider myself a fairly experienced programmer. I've been programming for over 5 years now. My weak point though is terminology. I'm self-taught, so while I know how to program, I don't know some of the more formal aspects of computer science. So, what are practical algorithms/data structures that I could recognize and know by name? Note, I'm not asking for a book recommendation about implementing algorithms. I don't care about implementing them, I just want to be able to recognize when an algorithm/data structure would be a good solution to a problem. I'm asking more for a list of algorithms/data structures that I should "recognize". For instance, I know the solution to a problem like this: You manage a set of lockers labeled 0-999. People come to you to rent the locker and then come back to return the locker key. How would you build a piece of software to manage knowing which lockers are free and which are in used? The solution, would be a queue or stack. What I'm looking for are things like "in what situation should a B-Tree be used -- What search algorithm should be used here" etc. And maybe a quick introduction of how the more complex(but commonly used) data structures/algorithms work. I tried looking at Wikipedia's list of data structures and algorithms but I think that's a bit overkill. So I'm looking more for what are the essential things I should recognize?

    Read the article

  • Understanding Data Science: Recent Studies

    - by Joe Lamantia
    If you need such a deeper understanding of data science than Drew Conway's popular venn diagram model, or Josh Wills' tongue in cheek characterization, "Data Scientist (n.): Person who is better at statistics than any software engineer and better at software engineering than any statistician." two relatively recent studies are worth reading.   'Analyzing the Analyzers,' an O'Reilly e-book by Harlan Harris, Sean Patrick Murphy, and Marck Vaisman, suggests four distinct types of data scientists -- effectively personas, in a design sense -- based on analysis of self-identified skills among practitioners.  The scenario format dramatizes the different personas, making what could be a dry statistical readout of survey data more engaging.  The survey-only nature of the data,  the restriction of scope to just skills, and the suggested models of skill-profiles makes this feel like the sort of exercise that data scientists undertake as an every day task; collecting data, analyzing it using a mix of statistical techniques, and sharing the model that emerges from the data mining exercise.  That's not an indictment, simply an observation about the consistent feel of the effort as a product of data scientists, about data science.  And the paper 'Enterprise Data Analysis and Visualization: An Interview Study' by researchers Sean Kandel, Andreas Paepcke, Joseph Hellerstein, and Jeffery Heer considers data science within the larger context of industrial data analysis, examining analytical workflows, skills, and the challenges common to enterprise analysis efforts, and identifying three archetypes of data scientist.  As an interview-based study, the data the researchers collected is richer, and there's correspondingly greater depth in the synthesis.  The scope of the study included a broader set of roles than data scientist (enterprise analysts) and involved questions of workflow and organizational context for analytical efforts in general.  I'd suggest this is useful as a primer on analytical work and workers in enterprise settings for those who need a baseline understanding; it also offers some genuinely interesting nuggets for those already familiar with discovery work. We've undertaken a considerable amount of research into discovery, analytical work/ers, and data science over the past three years -- part of our programmatic approach to laying a foundation for product strategy and highlighting innovation opportunities -- and both studies complement and confirm much of the direct research into data science that we conducted. There were a few important differences in our findings, which I'll share and discuss in upcoming posts.

    Read the article

  • IOMEGA 500GB hard disk data reccovery

    - by Vineeth
    Last year by November I bought an IOMEGA 500GB Prestige hard disk. Yesterday, unfortunately the hard disk fell down from my table. After that incident, when I connect my disk, Windows asks me to format the disk to use, but I didn't format it yet. Actually, on that hard disk I have about 320GB of data. I tried all my possible ways to access my disk. I tried using DOS. It shows "data error (Cyclic redundancy check)". I have a 3 year warranty. Will I be covered under warranty if I report this issue to IOMEGA? Can I get my data back?

    Read the article

  • Which statically typed languages support intersection types for function return values?

    - by stakx
    Initial note: This question got closed after several edits because I lacked the proper terminology to state accurately what I was looking for. Sam Tobin-Hochstadt then posted a comment which made me recognise exactly what that was: programming languages that support intersection types for function return values. Now that the question has been re-opened, I've decided to improve it by rewriting it in a (hopefully) more precise manner. Therefore, some answers and comments below might no longer make sense because they refer to previous edits. (Please see the question's edit history in such cases.) Are there any popular statically & strongly typed programming languages (such as Haskell, generic Java, C#, F#, etc.) that support intersection types for function return values? If so, which, and how? (If I'm honest, I would really love to see someone demonstrate a way how to express intersection types in a mainstream language such as C# or Java.) I'll give a quick example of what intersection types might look like, using some pseudocode similar to C#: interface IX { … } interface IY { … } interface IB { … } class A : IX, IY { … } class B : IX, IY, IB { … } T fn() where T : IX, IY { return … ? new A() : new B(); } That is, the function fn returns an instance of some type T, of which the caller knows only that it implements interfaces IX and IY. (That is, unlike with generics, the caller doesn't get to choose the concrete type of T — the function does. From this I would suppose that T is in fact not a universal type, but an existential type.) P.S.: I'm aware that one could simply define a interface IXY : IX, IY and change the return type of fn to IXY. However, that is not really the same thing, because often you cannot bolt on an additional interface IXY to a previously defined type A which only implements IX and IY separately. Footnote: Some resources about intersection types: Wikipedia article for "Type system" has a subsection about intersection types. Report by Benjamin C. Pierce (1991), "Programming With Intersection Types, Union Types, and Polymorphism" David P. Cunningham (2005), "Intersection types in practice", which contains a case study about the Forsythe language, which is mentioned in the Wikipedia article. A Stack Overflow question, "Union types and intersection types" which got several good answers, among them this one which gives a pseudocode example of intersection types similar to mine above.

    Read the article

  • how to find a good data center?

    - by drewda
    At my start-up, we're getting to the point where we should be hosting our servers at a data center. I'd appreciate any tips and tricks y'all can offer on finding a reputable place to colocate our racks. Are there any Web sites with customer reviews of data centers or should I just be asking around at techie events? Are unlimited bandwidth plans a gimmick or becoming the norm? Is it worth establishing a redundant set of machines at a second data center from Day One? Or just do offsite back-ups? Thanks for your suggestions.

    Read the article

  • Appending column to a data frame - R

    - by darkie15
    Is it possible to append a column to data frame in the following scenario? dfWithData <- data.frame(start=c(1,2,3), end=c(11,22,33)) dfBlank <- data.frame() ..how to append column start from dfWithData to dfBlank? It looks like the data should be added when data frame is being initialized. I can do this: dfBlank <- data.frame(dfWithData[1]) but I am more interested if it is possible to append columns to an empty (but inti)

    Read the article

  • Protecting Consolidated Data on Engineered Systems

    - by Steve Enevold
    In this time of reduced budgets and cost cutting measures in Federal, State and Local governments, the requirement to provide services continues to grow. Many agencies are looking at consolidating their infrastructure to reduce cost and meet budget goals. Oracle's engineered systems are ideal platforms for accomplishing these goals. These systems provide unparalleled performance that is ideal for running applications and databases that traditionally run on separate dedicated environments. However, putting multiple critical applications and databases in a single architecture makes security more critical. You are putting a concentrated set of sensitive data on a single system, making it a more tempting target.  The environments were previously separated by iron so now you need to provide assurance that one group, department, or application's information is not visible to other personnel or applications resident in the Exadata system. Administration of the environments requires formal separation of duties so an administrator of one application environment cannot view or negatively impact others. Also, these systems need to be in protected environments just like other critical production servers. They should be in a data center protected by physical controls, network firewalls, intrusion detection and prevention, etc Exadata also provides unique security benefits, including a reducing attack surface by minimizing packages and services to only those required. In addition to reducing the possible system areas someone may attempt to infiltrate, Exadata has the following features: 1.    Infiniband, which functions as a secure private backplane 2.    IPTables  to perform stateful packet inspection for all nodes               Cellwall implements firewall services on each cell using IPTables 3.    Hardware accelerated encryption for data at rest on storage cells Oracle is uniquely positioned to provide the security necessary for implementing Exadata because security has been a core focus since the company's beginning. In addition to the security capabilities inherent in Exadata, Oracle security products are all certified to run in an Exadata environment. Database Vault Oracle Database Vault helps organizations increase the security of existing applications and address regulatory mandates that call for separation-of-duties, least privilege and other preventive controls to ensure data integrity and data privacy. Oracle Database Vault proactively protects application data stored in the Oracle database from being accessed by privileged database users. A unique feature of Database Vault is the ability to segregate administrative tasks including when a command can be executed, or that the DBA can manage the health of the database and objects, but may not see the data Advanced Security  helps organizations comply with privacy and regulatory mandates by transparently encrypting all application data or specific sensitive columns, such as credit cards, social security numbers, or personally identifiable information (PII). By encrypting data at rest and whenever it leaves the database over the network or via backups, Oracle Advanced Security provides the most cost-effective solution for comprehensive data protection. Label Security  is a powerful and easy-to-use tool for classifying data and mediating access to data based on its classification. Designed to meet public-sector requirements for multi-level security and mandatory access control, Oracle Label Security provides a flexible framework that both government and commercial entities worldwide can use to manage access to data on a "need to know" basis in order to protect data privacy and achieve regulatory compliance  Data Masking reduces the threat of someone in the development org taking data that has been copied from production to the development environment for testing, upgrades, etc by irreversibly replacing the original sensitive data with fictitious data so that production data can be shared safely with IT developers or offshore business partners  Audit Vault and Database Firewall Oracle Audit Vault and Database Firewall serves as a critical detective and preventive control across multiple operating systems and database platforms to protect against the abuse of legitimate access to databases responsible for almost all data breaches and cyber attacks.  Consolidation, cost-savings, and performance can now be achieved without sacrificing security. The combination of built in protection and Oracle’s industry-leading data protection solutions make Exadata an ideal platform for Federal, State, and local governments and agencies.

    Read the article

  • using core data with web services

    - by Jayshree
    Hi. i am a noob in xcode. I am developing an iphone app where i need to send and receive data from a web service. And i need to store them temporarily in my app. i dont want to use sqlite. so i was wondering if i should use core data for this purpose. I read some articles but i still dont have a clear picture of how to do it, coz i have used core data only with sqlite. I want to do the following things : Will receive table data from a web service. Have to perform certain calculations on those fields. Will send the data back in xml format to the server. How do i convert the xml data into int, date or any other data type? and how do i store it in managed data objects? Can anyone please help me with this??? thnx for your time.

    Read the article

  • Presenting Loading Data Warehouse Partitions with SSIS 2012 at SQL Saturday DC!

    - by andyleonard
    Join Darryll Petrancuri and me as we present Loading Data Warehouse Partitions with SSIS 2012 Saturday 8 Dec 2012 at SQL Saturday 173 in DC ! SQL Server 2012 table partitions offer powerful Big Data solutions to the Data Warehouse ETL Developer. In this presentation, Darryll Petrancuri and Andy Leonard demonstrate one approach to loading partitioned tables and managing the partitions using SSIS 2012, and reporting partition metrics using SSRS 2012. Objectives A practical solution for loading Big...(read more)

    Read the article

  • Why CFOs Should Care About Big Data

    - by jmorourke
    The topic of “big data” clearly has reached a tipping point in 2012.  With plenty of coverage over the past few years in the IT press, we are now starting to see the topic of “big data” covered in mainstream business press, including a cover story in the October 2012 issue of the Harvard Business Review.  To help customers understand the challenges of managing “big data” as well as the opportunities that can be created by leveraging “big data”, Oracle has recently run and published the results of a customer survey, as well as white papers and articles on this topic.  Most recently, we commissioned a white paper titled “Mastering Big Data: CFO Strategies to Transform Insight into Opportunity”. The premise here is that “big data” is not just a topic that CIOs should pay attention to, but one that CFOs should understand and take advantage of as well.  Clearly, whoever masters the art and science of big data will be positioned for competitive advantage in their industries or markets.  That’s why smart CFOs are taking control of big data and business analytics projects, not just to uncover new ways to drive growth in a slowing global economy, but also to be a catalyst for change in the enterprise.  With an increasing number of CFOs now responsible for overseeing IT investments and providing strategic insight to the board, CFOs will be increasingly called upon to take a leadership role in assessing the value of “big data” initiatives, building on their traditional skills in reporting and helping managers analyze data to support decision making. Here’s a link to the white paper referenced above, which is posted on the Oracle C-Central/CFO web site, as well as some other resources that can help CFOs master the topic of “big data”: White Paper “Mastering Big Data:  CFO Strategies to Transform Insight into Opportunity CFO Market Watch article:  “Does Big Data Affect the CFO?” Oracle Survey Report:  “From Overload to Impact – An Industry Scorecard on Big Data Industry Challenges” Upcoming Big Data Webcast with Andrew McAfee Here’s a general link to Oracle C-Central/CFO in case you want to start there: www.oracle.com/c-central/cfo Feel free to contact me if you have any questions or need additional information:  [email protected]

    Read the article

  • data recovery from unallocated harddisk partition

    - by user36007
    Hi, I accidentally deleted a partition which mainly served as space I put my data, labeled D: drive. The partition wasn't subsequently formatted though, following the delete incident. Obviously the D: drive doesn't show up as it usually does when I run Windows 7. In the "Computer Management", on clicking the Disk Management I clearly see the space is now labled as unallocated. question: How do I go about recovering my data. Perhaps what the effective data recovery software I can use to resolve this issue. Thanks

    Read the article

  • data recovery from unallocated harddisk partition

    - by user36007
    Hi, I accidentally deleted a partition which mainly served as space I put my data, labeled D: drive. The partition wasn't subsequently formatted though, following the delete incident. Obviously the D: drive doesn't show up as it usually does when I run Windows 7. In the "Computer Management", on clicking the Disk Management I clearly see the space is now labled as unallocated. question: How do I go about recovering my data. Perhaps what the effective data recovery software I can use to resolve this issue. Thanks

    Read the article

< Previous Page | 26 27 28 29 30 31 32 33 34 35 36 37  | Next Page >