Search Results

Search found 7116 results on 285 pages for 'nested queries'.

Page 82/285 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89 | Next Page >

Technology Selection for a dynamic product

- by Kuntal Shah

We are building a product for Procurement Domain in JAVA. Following are the main technical requirements. Platform Independent Database Independent Browser Independent In functional requirements the product is very dynamic in nature. The main reason being the procurement process around the world is different from client to client. Briefly we need to have a dynamic workflow engine and a dynamic template engine. The workflow engine by which we can define any kind of workflows and the template engine allows us to define any kind of data structures and based on definition it can get the user input through workflow. We have been developing this product for almost 2 years. It has been a long time till we can get down with the dynamics of requirements. Till now we have developed a basic workflow and template engine and which is in use at one of the client. We have been using following technologies. GWT-Ext (Front End Framework) Hibernate (Database Layer) In between we have faced some issues with GWT-Ext (mainly browser compatibility) and database optimization due to sub classing in hibernate. For resolving GWT-Ext issue, which a dying community so we decided to move to SmartGWT. In SmartGWT we faced issues related to loading and now we are able to finalize that GWT 2.3 will be the way to go as the library is rich and performance is upto the mark. We are able to almost finalize GWT-Spring based front and middle layer. In hibernate, we found main issues with sub-classing due to that it was throwing astronomical queries and sometimes it would stop firing any queries for 5-10 seconds or may be around 30 seconds and then resume again. Few days back I came to one article related to ORM. I am a traditional .Net SQL developer and I have always worked with relational database. Reading through this article, I also found it relating to the issues I face. I am still not completely convinced of using hibernate and this article just supported my opinion. Following are the questions for which I am looking for an answer. Should we be going with Hibernate in case of dynamic database requirements and the load of the data will be heavy in future? How can we partition the data, how we can efficiently join the data, how we can optimize the queries? If the answer is no then how do we achieve database independence? Is our choice related to GWT and Spring proper or do we need to change that too? Should we use any other key value pair database if the data is dynamic in nature and it is very difficult to make it relational?

Read the article
Idea to develop a caching server between IIS and SQL Server

- by John

I work on a few high traffic websites that all share the same database and that are all heavily database driven. Our SQL server is max-ed out and, although we have already implemented many changes that have helped but the server is still working too hard. We employ some caching in our website but the type of queries we use negate using SQL dependency caching. We tried SQL replication to try and kind of load balance but that didn't prove very successful because the replication process is quite demanding on the servers too and it needed to be done frequently as it is important that data is up to date. We do use a Varnish web caching server (Linux based) to take a bit of the load off both the web and database server but as a lot of the sites are customised based on the user we can only do so much. Anyway, the reason for this question... Varnish gave me an idea for a possible application that might help in this situation. Just like Varnish sits between a web browser and the web server and caches response from the web server, I was wondering about the possibility of creating something that sits between the web server and the database server. Imagine that all SQL queries go through this SQL caching server. If it's a first time query then it will get recorded, and the result requested from the SQL server and stored locally on the cache server. If it's a repeat request within a set time then the result gets retrieved from the local copy without the query being sent to the SQL server. The caching server could also take advantage of SQL dependency caching notifications. This seems like a good idea in theory. There's still the same amount of data moving back and forward from the web server, but the SQL server is relieved of the work of processing the repeat queries. I wonder about how difficult it would be to build a service that sort of emulates requests and responses from SQL server, whether SQL server's own caching is doing enough of this already that this wouldn't be a benefit, or even if someone has done this before and I haven't found it? I would welcome any feedback or any references to any relevant projects.

Read the article
Entity Framework 4.0: Optimal and horrible SQL

- by DigiMortal

Lately I had Entity Framework 4.0 session where I introduced new features of Entity Framework. During session I found out with audience how Entity Framework 4.0 can generate optimized SQL. After session I also showed guys one horrible example about how awful SQL can be generated by Entity Framework. In this posting I will cover both examples. Optimal SQL Before going to code take a look at following model. There is class called Event and I will use this class in my query. Here is the LINQ To Entities query that uses small anonymous type. var query = from e in _context.Events select new { Id = e.Id, Title = e.Title }; Debug.WriteLine(((ObjectQuery)query).ToTraceString()); Running this code gives us the following SQL. SELECT [Extent1].[event_id] AS [event_id], [Extent1].[title] AS [title] FROM [dbo].[events] AS [Extent1] This is really small – no additional fields in SELECT clause. Nice, isn’t it? Horrible SQL Ayende Rahien blog shows us darker side of Entiry Framework 4.0 queries. You can find comparison betwenn NHibernate, LINQ To SQL and LINQ To Entities from posting What happens behind the scenes: NHibernate, Linq to SQL, Entity Framework scenario analysis. In this posting I will show you the resulting query and let you think how much better it can be done. Well, it is not something we want to see running in our servers. I hope that EF team improves generated SQL to acceptable level before Visual Studio 2010 is released. There is also morale of this example: you should always check out the queries that O/R-mapper generates. Behind the curtains it may silently generate queries that perform badly and in this case you need to optimize you data querying strategy. Conclusion Entity Framework 4.0 is new product with a lot of new features and it is clear that not everything is 100% super in its first release. But it still great step forward and I hope that on 12.04.2010 we have new promising O/R-mapper available to use in our projects. If you want to read more about Entity Framework 4.0 and Visual Studio 2010 then please feel free to follow this link to list of my Visual Studio 2010 and .NET Framework 4.0 postings.

Read the article
Advanced Search Stored procedure

- by Ray Eatmon

So I am working on an MVC ASP.NET web application which centers around lots of data and data manipulation. PROBLEM OVERVIEW: We have an advanced search with 25 different filter criteria. I am using a stored procedure for this search. The stored procedure takes in parameters, filter for specific objects, and calculates return data from those objects. It queries large tables 14 millions records on some table, filtering and temp tables helped alleviate some of the bottle necks for those queries. ISSUE: The stored procedure used to take 1 min to run, which creates a timeout returning 0 results to the browser. I rewrote the procedure and got it down to 21 secs so the timeout does not occur. This ONLY occurs this slow the FIRST time the search is run, after that it takes like 5 secs. I am wondering should I take a different approach to this problem, should I worry about this type of performance issue if it does not timeout?

Read the article
Tuning Red Gate: #4 of Some

- by Grant Fritchey

First time connecting to these servers directly (keys to the kingdom, bwa-ha-ha-ha. oh, excuse me), so I'm going to take a look at the server properties, just to see if there are any issues there. Max memory is set, cool, first possible silly mistake clear. In fact, these look to be nicely set up. Oh, I'd like to see the ANSI Standards set by default, but it's not a big deal. The default location for database data is the F:\ drive, where I saw all the activity last time. Cool, the people maintaining the servers in our company listen, parallelism threshold is set to 35 and optimize for ad hoc is enabled. No shocks, no surprises. The basic setup is appropriate. On to the problem database. Nothing wrong in the properties. The database is in SIMPLE recovery, but I think it's a reporting system, so no worries there. Again, I'd prefer to see the ANSI settings for connections, but that's the worst thing I can see. Time to look at the queries, tables, indexes and statistics because all the information I've collected over the last several days suggests that we're not looking at a systemic problem (except possibly not enough memory), but at the traditional tuning issues. I just want to note that, I started looking at the system, not the queries. So should you when tuning your environment. I know, from the data collected through SQL Monitor, what my top poor performing queries are, and the most frequently called, etc. I'm starting with the most frequently called. I'm going to get the execution plan for this thing out of the cache (although, with the cache dumping constantly, I might not get it). And it's not there. Called 1.3 million times over the last 3 days, but it's not in cache. Wow. OK. I'll see what's in cache for this database: SELECT deqs.creation_time, deqs.execution_count, deqs.max_logical_reads, deqs.max_elapsed_time, deqs.total_logical_reads, deqs.total_elapsed_time, deqp.query_plan, SUBSTRING(dest.text, (deqs.statement_start_offset / 2) + 1, (deqs.statement_end_offset - deqs.statement_start_offset) / 2 + 1) AS QueryStatement FROM sys.dm_exec_query_stats AS deqs CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest CROSS APPLY sys.dm_exec_query_plan(deqs.plan_handle) AS deqp WHERE dest.dbid = DB_ID('Warehouse') AND deqs.statement_end_offset > 0 AND deqs.statement_start_offset > 0 ORDER BY deqs.max_logical_reads DESC ; And looking at the most expensive operation, we have our first bad boy: Multiple table scans against very large sets of data and a sort operation. a sort operation? It's an insert. Oh, I see, the table is a heap, so it's doing an insert, then sorting the data and then inserting into the primary key. First question, why isn't this a clustered index? Let's look at some more of the queries. The next one is deceiving. Here's the query plan: You're thinking to yourself, what's the big deal? Well, what if I told you that this thing had 8036318 reads? I know, you're looking at skinny little pipes. Know why? Table variable. Estimated number of rows = 1. Actual number of rows. well, I'm betting several more than one considering it's read 8 MILLION pages off the disk in a single execution. We have a serious and real tuning candidate. Oh, and I missed this, it's loading the table variable from a user defined function. Let me check, let me check. YES! A multi-statement table valued user defined function. And another tuning opportunity. This one's a beauty, seriously. Did I also mention that they're doing a hash against all the columns in the physical table. I'm sure that won't lead to scans of a 500,000 row table, no, not at all. OK. I lied. Of course it is. At least it's on the top part of the Loop which means the scan is only executed once. I just did a cursory check on the next several poor performers. all calling the UDF. I think I found a big tuning opportunity. At this point, I'm typing up internal emails for the company. Someone just had their baby called ugly. In addition to a series of suggested changes that we need to implement, I'm also apologizing for being such an unkind monster as to question whether that third eye & those flippers belong on such an otherwise lovely child.

Read the article
Modifying Service URLs with LINQ to Twitter

- by Joe Mayo

It’s funny that two posts so close together speak about flexibility with the LINQ to Twitter provider. There are certain things you know from experience on when to make software more flexible and when to save time. This is another one of those times when I got lucky and made the right choice up front. I’m talking about the ability to switch URLs. It only makes sense that Twitter should begin versioning their API as it matures. In fact, most of the entire API has moved to the v1 URL at “https://api.twitter.com/1/”, except for search and trends. Recently, Twitter introduced the available and local trends, but hung them off the new v1, and left the rest of the trends API on the old URL. To implement this, I muscled my way into the expression tree during CreateRequestProcessor to figure out which trend I was dealing with; perhaps not elegant, but the code is in the right place and that’s what factories are for. Anyway, the point is that I wouldn’t have to do this kind of stuff (as much fun as it is), if Twitter would have more consistency. Having went to Chirp last week and seeing the evolution of the API, it looks like my wish is coming true. …now if they would just get their stuff together on the mess they made with geo-location and places… but again, that’s all transparent if your using LINQ to Twitter because I pulled all of that together in a consistent way so that you don’t have to. Normally, when Twitter makes a change, code breaks and I have to scramble to get the fixes in-place. This time, in the case of a URL change, the adjustment is easy and no-one has to wait for me. Essentially, all you need to do is change the URL passed to the TwitterContext constructor. Here’s an example of instantiating a TwitterContext now: using (var twitterCtx = new TwitterContext(auth, "https://api.twitter.com/1/", "https://search.twitter.com/")) The third parameter constructor is the SearchUrl, which is used for Search and Trend APIs. You probably know what’s coming next; another constructor, but with the SearchUrl parameter set to the new URL as follows: using (var twitterCtx = new TwitterContext(auth, "https://api.twitter.com/1/", "https://api.twitter.com/1/")) One consequence of setting the URL this way is that you set the URL for both Trends and Search. Since Search is still using the old URL, this is going to break for Search queries. You could always instantiate a special TwitterContext instance for Search queries, with the old URL set. Alternatively, you can use the TwitterContext’s SearchUrl property. Here’s an example: twitterCtx.SearchUrl = "https://api.twitter.com/1/"; var trends = (from trend in twitterCtx.Trends where trend.Type == TrendType.Daily && trend.Date == DateTime.Now.AddDays(-2).Date select trend) .ToList(); Notice how I set the SearchUrl property just-in-time for the query. This allows you to target the URL for each specific query. Whichever way you prefer to configure the URL, it’s your choice. So, now you know how to set the URL to be used for Trend queries and how to prevent whacking your Search queries. I’ll be updating the Trend API to use same URL as all other APIs soon, so the only API left to use the SearchUrl will be Search, but for the short term, it’s Trends and Search. Until I make this change, you’ll have a viable work-around by setting the URL yourself, as explained above. These were the Search and Trend URLs, but you might be curious about the second parameter of the TwitterContext constructor; that’s the URL for all other APIs (the BaseUrl), except for Trend and Search. Similarly, you can use the TwitterContext’s BaseUrl property to set the BaseUrl. Setting the BaseUrl can be useful when communicating with other services. In addition to Twitter changing URLs, the Twitter API has been adopted by other companies, such as Identi.ca, Tumblr, and WordPress. This capability lets you use LINQ to Twitter with any of these services. This is a testament to the success of the Twitter API and it’s popularity. No doubt we’ll have hills and valleys to traverse as the Twitter API matures, but hopefully there will be enough flexibility in LINQ to Twitter to make these changes as transparent as possible for you. @JoeMayo

Read the article
SSAS: Utility to check you have the correct data types and sizes in your cube definition

- by DrJohn

This blog describes a tool I developed which allows you to compare the data types and data sizes found in the cube’s data source view with the data types/sizes of the corresponding dimensional attribute. Why is this important? Well when creating named queries in a cube’s data source view, it is often necessary to use the SQL CAST or CONVERT operation to change the data type to something more appropriate for SSAS. This is particularly important when your cube is based on an Oracle data source or using custom SQL queries rather than views in the relational database. The problem with BIDS is that if you change the underlying SQL query, then the size of the data type in the dimension does not update automatically. This then causes problems during deployment whereby processing the dimension fails because the data in the relational database is wider than that allowed by the dimensional attribute. In particular, if you use some string manipulation functions provided by SQL Server or Oracle in your queries, you may find that the 10 character string you expect suddenly turns into an 8,000 character monster. For example, the SQL Server function REPLACE returns column with a width of 8,000 characters. So if you use this function in the named query in your DSV, you will get a column width of 8,000 characters. Although the Oracle REPLACE function is far more intelligent, the generated column size could still be way bigger than the maximum length of the data actually in the field. Now this may not be a problem when prototyping, but in your production cubes you really should clean up this kind of thing as these massive strings will add to processing times and storage space. Similarly, you do not want to forget to change the size of the dimension attribute if your database columns increase in size. Introducing CheckCubeDataTypes Utiltity The CheckCubeDataTypes application extracts all the data types and data sizes for all attributes in the cube and compares them to the data types and data sizes in the cube’s data source view. It then generates an Excel CSV file which contains all this metadata along with a flag indicating if there is a mismatch between the DSV and the dimensional attribute. Note that the app not only checks all the attribute keys but also the name and value columns for each attribute. Another benefit of having the metadata held in a CSV text file format is that you can place the file under source code control. This allows you to compare the metadata of the previous cube release with your new release to highlight problems introduced by new development. You can download the C# source code from here: CheckCubeDataTypes.zip A typical example of the output Excel CSV file is shown below - note that the last column shows a data size mismatch by TRUE appearing in the column

Read the article
I'm creating my own scalable, rapid prototyping web server. How should I design it?

- by Mike Willliams

I'm going to create my own web server that focuses on scalability, rapid prototyping and the use of JavaScript as the server's scripting language, much like node.js. It will use a Model-View-Controller design pattern so a web application can support more concurrent users just by adding hardware -- and not having to redesign the software. Basically, I'm aiming to produce a framework that allows for fast and easy development of cloud applications without the need to write lots of boiler plate code. I've got some questions about this... How hard will it be to put MySQL in the cloud? How could I go about implementing this and make the resulting product free? Will I have to write my own engine or modify an existing one, if I do what should I watch out for? To make this scalable I need to adjust from one server to hundreds of servers this creates the requirement for the servers to be load balancing, how should I do this? If I balance based on the work load per server I would need gateway to handle all the incoming requests. Is it the right idea to have all the servers check into the gateway and update there status. By having the servers run through a gateway if the gateway dies all the incoming requests are ignored. I'm thinking that having all the servers maintain a list of each other, or at least a few I could rebuild the list of servers and establish a new gateway. Is it worth it? Or should I have a backup gateway that could switch out? Should I let the user choose? How should I pick which server handles the database and which handles the page serving? Should I spread the database so that queries are preformed on multiple servers? Which would theoretically improve performance. The servers would need to mirror the database at least once so that if a server goes down the database isn't corrupted. So this brings up writing another question, should I broadcast SQL queries so that all the servers can take a bit of the work load? If I do it that way wouldn't a query clog up the network so that other queries couldn't be preformed? What are my alternatives? Finally, is there a free solution already out there that might need a little modification that suits my needs?

Read the article
Sub routing in a SPA site

- by Anders

I have a SPA site that I'm working on, I have a requirement that you can have subroutes for a page view model. Im currently using this 'pattern' for the site MyApp.FooViewModel = MyApp.define({ meta: { query: MyApp.Core.Contracts.Queries.FooQuery, title: "Foo" }, init: function (queryResult) { }, prototype: { } }); In the master view model I have a route table this.navigation(new MyApp.RoutesViewModel({ Home: { model: MyApp.HomeViewModel, route: String.empty }, Foo: { model: MyApp.FooViewModel } })); The meta object defines which query should populate the top level view model when its invoked through sammyjs, this is all fine but it does not support sub routing My plan is to change the meta object so that it can (optional offcourse) look like this meta: { query: MyApp.Core.Contracts.Queries.FooQuery, title: "Foo", route: { barId: MyApp.BarViewModel } } When sammyjs detects a barId in the query string the Barmodel will be executed and populated through its own meta object. Is this a good design?

Read the article
How to reduce Entity Framework 4 query compile time?

- by Rup

Summary: We're having problems with EF4 query compilation times of 12+ seconds. Cached queries will only get us so far; are there any ways we can actually reduce the compilation time? Is there anything we might be doing wrong we can look for? Thanks! We have an EF4 model which is exposed over the WCF services. For each of our entity types we expose a method to fetch and return the whole entity for display / edit including a number of referenced child objects. For one particular entity we have to .Include() 31 tables / sub-tables to return all relevant data. Unfortunately this makes the EF query compilation prohibitively slow: it takes 12-15 seconds to compile and builds a 7,800-line, 300K query. This is the back-end of a web UI which will need to be snappier than that. Is there anything we can do to improve this? We can CompiledQuery.Compile this - that doesn't do any work until first use and so helps the second and subsequent executions but our customer is nervous that the first usage shouldn't be slow either. Similarly if the IIS app pool hosting the web service gets recycled we'll lose the cached plan, although we can increase lifetimes to minimise this. Also I can't see a way to precompile this ahead of time and / or to serialise out the EF compiled query cache (short of reflection tricks). The CompiledQuery object only contains a GUID reference into the cache so it's the cache we really care about. (Writing this out it occurs to me I can kick off something in the background from app_startup to execute all queries to get them compiled - is that safe?) However even if we do solve that problem, we build up our search queries dynamically with LINQ-to-Entities clauses based on which parameters we're searching on: I don't think the SQL generator does a good enough job that we can move all that logic into the SQL layer so I don't think we can pre-compile our search queries. This is less serious because the search data results use fewer tables and so it's only 3-4 seconds compile not 12-15 but the customer thinks that still won't really be acceptable to end-users. So we really need to reduce the query compilation time somehow. Any ideas? Profiling points to ELinqQueryState.GetExecutionPlan as the place to start and I have attempted to step into that but without the real .NET 4 source available I couldn't get very far, and the source generated by Reflector won't let me step into some functions or set breakpoints in them. The project was upgraded from .NET 3.5 so I have tried regenerating the EDMX from scratch in EF4 in case there was something wrong with it but that didn't help. I have tried the EFProf utility advertised here but it doesn't look like it would help with this. My large query crashes its data collector anyway. I have run the generated query through SQL performance tuning and it already has 100% index usage. I can't see anything wrong with the database that would cause the query generator problems. Is there something O(n^2) in the execution plan compiler - is breaking this down into blocks of separate data loads rather than all 32 tables at once likely to help? Setting EF to lazy-load didn't help. I've bought the pre-release O'Reilly Julie Lerman EF4 book but I can't find anything in there to help beyond 'compile your queries'. I don't understand why it's taking 12-15 seconds to generate a single select across 32 tables so I'm optimistic there's some scope for improvement! Thanks for any suggestions! We're running against SQL Server 2008 in case that matters and XP / 7 / server 2008 R2 using RTM VS2010.

Read the article
SQL Query Builder/Designer and code Formating

- by DavRob60

I write SQL query every now and then, I could easily write them freehand, but sometimes I do create SQL queries using SQL Query Designers for various reason. (I wont start to enumerate them here and/or argue about their usefulness, so let's just say they are sometime useful.) Anyway, I currently use 2 Query Designers : SQL server management studio's Query Designer. Visual Studio 2010's Query Builder (must often within the Table adapter Query Configuration Wizard.) There's something I hate about those two (I don't know about the others), it's the way they throw away my Code formatting of SQL queries after an edit. Is there any way to configure something to automatically reformat the SQL output or is there any external tool/plug-in that I could use to do that job?

Read the article
Speaking in St. Louis on June 14th

- by Bill Graziano

I’m going back to speak in St. Louis next month. I didn’t make it last year and I’m looking forward to it. You can find additional details on the St. Louis SQL Server user group web site. The meeting will be held at the Microsoft office and I’ll be speaking at 1PM. I’ll be speaking on the procedure cache. As people get better and better tuning queries this is the next major piece to understand. We’ll talk about how and when query plans are reused. The most common issue I see around odd query plans are stored procedures that use one query plan but the queries run completely different when you extract the SQL and hard code the parameters. That’s just one of the common issues that I’ll address. There will be a second speaker after I’m done, then a short vendor presentation and a drawing for a netbook.

Read the article
A Warning to Those Using sys.dm_exec_query_stats

- by Adam Machanic

The sys.dm_exec_query_stats view is one of my favorite DMVs. It has replaced a large chunk of what I used to use SQL Trace for--pulling metrics about what queries are running and how often--and it makes this kind of data collection painless and automatic. What's not to love? But use cases for the view are a topic for another post. Today I want to quickly point out an inconsistency. If you're using this view heavily, as I am, you should know that in some cases your queries will not get a row. One...(read more)

Read the article
Back from PASS Europe 2010

- by Davide Mauri

PASS Europe 2010 is finished and I’m now finally back at home and will stay here for a while. I would like to thanks all the people who has come to my sessions for all their feedback, especially for the “Adaptive BI” session! Slides and demos should be available for download from the PASS European Conference website in a couple of days. Meanwhile if you want to rate my session online, you can do it here: Adaptive BI http://speakerrate.com/talks/3136-adaptive-bi-best-practices Blazing Fast Queries http://speakerrate.com/talks/3135-blazing-fast-queries-when-indexes-are-not-enough Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
How to use TFS as a query tracking system?

- by deostroll

We already use tfs for managing defects in code etc, etc. We additionally need a way to "understand the domain & requirements of the products". Normally, without tfs we exchange emails with the consultants and have the questions/queries answered. If it is a feature implementation we sometimes "find" conflicts in the implementation itself. And when that happens the userstory is modified and the enhancement/bug as per that is raised in TFS. Sometimes it is critical we come back to decisions we made or questions we wanted answers to. Hence we need to be able to track how that "requirement idea" or that "query in concern" evolved. Hence how is it that we can use TFS to track all of this? Do we raise an "issue" item for this? Or do we raise a "bug" item? The main things we'd ideally look in a query tracking system are as follows: Area: Can be a module, submodule, domain. Sometimes this may be "General" - to address domain related stuff, or, event more granular to address modules, sub-modules. Take the case for the latter, if we were tracking this in excel sheets, we'd just write module1,submodule2; i.e. in a comma separated fashion. The things I would like here is to be able search for all queries relating to submodule2 sometime in the future. Responses: This is a record of conversations between the consultant and any other stakeholder. For a simple case, it would just be paragraphs. Each para would start with a name and date enclosed in brackets and the response following that...each para would be like a thread - much like a forum thread Action taken: We'd want to know how the query was closed, what was the input given, what were the changes that took place because of that, etc etc. These are fields I think I would need in such a system apart from some obvious ones like status, address to, resovled by, etc. I am open for any other fields which are sort of important. To summarise my question: how can we manage "queries" in the system? Where should we ideally store data pertaining to those three fields I have mentioned above (for e.g. is it wise to store responses in the history tag assuming we are opening a bug for the query)?

Read the article
Working with Timelines with LINQ to Twitter

- by Joe Mayo

When first working with the Twitter API, I thought that using SinceID would be an effective way to page through timelines. In practice it doesn’t work well for various reasons. To explain why, Twitter published an excellent document that is a must-read for anyone working with timelines: Twitter Documentation: Working with Timelines This post shows how to implement the recommended strategies in that document by using LINQ to Twitter. You should read the document in it’s entirety before moving on because my explanation will start at the bottom and work back up to the top in relation to the Twitter document. What follows is an explanation of SinceID, MaxID, and how they come together to help you efficiently work with Twitter timelines. The Role of SinceID Specifying SinceID says to Twitter, “Don’t return tweets earlier than this”. What you want to do is store this value after every timeline query set so that it can be reused on the next set of queries. The next section will explain what I mean by query set, but a quick explanation is that it’s a loop that gets all new tweets. The SinceID is a backstop to avoid retrieving tweets that you already have. Here’s some initialization code that includes a variable named sinceID that will be used to populate the SinceID property in subsequent queries: // last tweet processed on previous query set ulong sinceID = 210024053698867204; ulong maxID; const int Count = 10; var statusList = new List<status>(); Here, I’ve hard-coded the sinceID variable, but this is where you would initialize sinceID from whatever storage you choose (i.e. a database). The first time you ever run this code, you won’t have a value from a previous query set. Initially setting it to 0 might sound like a good idea, but what if you’re querying a timeline with lots of tweets? Because of the number of tweets and rate limits, your query set might take a very long time to run. A caveat might be that Twitter won’t return an entire timeline back to Tweet #0, but rather only go back a certain period of time, the limits of which are documented for individual Twitter timeline API resources. So, to initialize SinceID at too low of a number can result in a lot of initial tweets, yet there is a limit to how far you can go back. What you’re trying to accomplish in your application should guide you in how to initially set SinceID. I have more to say about SinceID later in this post. The other variables initialized above include the declaration for MaxID, Count, and statusList. The statusList variable is a holder for all the timeline tweets collected during this query set. You can set Count to any value you want as the largest number of tweets to retrieve, as defined by individual Twitter timeline API resources. To effectively page results, you’ll use the maxID variable to set the MaxID property in queries, which I’ll discuss next. Initializing MaxID On your first query of a query set, MaxID will be whatever the most recent tweet is that you get back. Further, you don’t know what MaxID is until after the initial query. The technique used in this post is to do an initial query and then use the results to figure out what the next MaxID will be. Here’s the code for the initial query: var userStatusResponse = (from tweet in twitterCtx.Status where tweet.Type == StatusType.User && tweet.ScreenName == "JoeMayo" && tweet.SinceID == sinceID && tweet.Count == Count select tweet) .ToList(); statusList.AddRange(userStatusResponse); // first tweet processed on current query maxID = userStatusResponse.Min( status => ulong.Parse(status.StatusID)) - 1; The query above sets both SinceID and Count properties. As explained earlier, Count is the largest number of tweets to return, but the number can be less. A couple reasons why the number of tweets that are returned could be less than Count include the fact that the user, specified by ScreenName, might not have tweeted Count times yet or might not have tweeted at least Count times within the maximum number of tweets that can be returned by the Twitter timeline API resource. Another reason could be because there aren’t Count tweets between now and the tweet ID specified by sinceID. Setting SinceID constrains the results to only those tweets that occurred after the specified Tweet ID, assigned via the sinceID variable in the query above. The statusList is an accumulator of all tweets receive during this query set. To simplify the code, I left out some logic to check whether there were no tweets returned. If the query above doesn’t return any tweets, you’ll receive an exception when trying to perform operations on an empty list. Yeah, I cheated again. Besides querying initial tweets, what’s important about this code is the final line that sets maxID. It retrieves the lowest numbered status ID in the results. Since the lowest numbered status ID is for a tweet we already have, the code decrements the result by one to keep from asking for that tweet again. Remember, SinceID is not inclusive, but MaxID is. The maxID variable is now set to the highest possible tweet ID that can be returned in the next query. The next section explains how to use MaxID to help get the remaining tweets in the query set. Retrieving Remaining Tweets Earlier in this post, I defined a term that I called a query set. Essentially, this is a group of requests to Twitter that you perform to get all new tweets. A single query might not be enough to get all new tweets, so you’ll have to start at the top of the list that Twitter returns and keep making requests until you have all new tweets. The previous section showed the first query of the query set. The code below is a loop that completes the query set: do { // now add sinceID and maxID userStatusResponse = (from tweet in twitterCtx.Status where tweet.Type == StatusType.User && tweet.ScreenName == "JoeMayo" && tweet.Count == Count && tweet.SinceID == sinceID && tweet.MaxID == maxID select tweet) .ToList(); if (userStatusResponse.Count > 0) { // first tweet processed on current query maxID = userStatusResponse.Min( status => ulong.Parse(status.StatusID)) - 1; statusList.AddRange(userStatusResponse); } } while (userStatusResponse.Count != 0 && statusList.Count < 30); Here we have another query, but this time it includes the MaxID property. The SinceID property prevents reading tweets that we’ve already read and Count specifies the largest number of tweets to return. Earlier, I mentioned how it was important to check how many tweets were returned because failing to do so will result in an exception when subsequent code runs on an empty list. The code above protects against this problem by only working with the results if Twitter actually returns tweets. Reasons why there wouldn’t be results include: if the first query got all the new tweets there wouldn’t be more to get and there might not have been any new tweets between the SinceID and MaxID settings of the most recent query. The code for loading the returned tweets into statusList and getting the maxID are the same as previously explained. The important point here is that MaxID is being reset, not SinceID. As explained in the Twitter documentation, paging occurs from the newest tweets to oldest, so setting MaxID lets us move from the most recent tweets down to the oldest as specified by SinceID. The two loop conditions cause the loop to continue as long as tweets are being read or a max number of tweets have been read. Logically, you want to stop reading when you’ve read all the tweets and that’s indicated by the fact that the most recent query did not return results. I put the check to stop after 30 tweets are reached to keep the demo from running too long – in the console the response scrolls past available buffer and I wanted you to be able to see the complete output. Yet, there’s another point to be made about constraining the number of items you return at one time. The Twitter API has rate limits and making too many queries per minute will result in an error from twitter that LINQ to Twitter raises as an exception. To use the API properly, you’ll have to ensure you don’t exceed this threshold. Looking at the statusList.Count as done above is rather primitive, but you can implement your own logic to properly manage your rate limit. Yeah, I cheated again. Summary Now you know how to use LINQ to Twitter to work with Twitter timelines. After reading this post, you have a better idea of the role of SinceID - the oldest tweet already received. You also know that MaxID is the largest tweet ID to retrieve in a query. Together, these settings allow you to page through results via one or more queries. You also understand what factors affect the number of tweets returned and considerations for potential error handling logic. The full example of the code for this post is included in the downloadable source code for LINQ to Twitter. @JoeMayo

Read the article
Expected time for an CakePHP MVC form/controller and db make up

- by hephestos

I would like to know, what is an average time for building a form in MVC pattern with for example CakePHP. I build 8 functions, two of them do custom queries, return json data, split them, expand them in a model in memory and delivers to the view. Those are three queries if you consider and an array to feed view for making some combo box. Why? all these, because I have data from json and I split them in order to make row of data like a table. Like that I changed a bit the edit.ctp but not a lot. And I created a javascript outside, with three functions. One collects data the other upon a change of a combo returnes the selected values, and does also some redirection flow. All this, in average how much time should it take ?

Read the article
Why is database developer pay so high? [closed]

- by user433500

Just wondering why someone would get 10k+ in some area in US for just writing queries and creating tables. While the average salary for someone who does scripting, object oriented programming, J2EE and database all together is only ~12K in new york city. Is there similar opportunities in cities like new york where only doing database gets one 10K+? What is the rational of companies paying such a high salary to consultants for just writing simple queries? I am sure college grad can do that with ease and will be quite satisfied with a 60k+ pay for a couple of year. Does location really matter so much?

Read the article
Query-generated Column Expressions

A technique from Bill Nicolich that allows you to target columns by data type for the same custom expression and easily build complex queries.

Read the article
SQL Strings vs. Conditional SQL Statements

- by Yatrix

Is there an advantage to piecemealing sql strings together vs conditional sql statements in SQL Server itself? I have only about 10 months of SQL experience, so I could be speaking out of pure ignorance here. Where I work, I see people building entire queries in strings and concatenating strings together depending on conditions. For example: Set @sql = 'Select column1, column2 from Table 1 ' If SomeCondtion @sql = @sql + 'where column3 = ' + @param1 else @sql = @sql + 'where column4 = ' + @param2 That's a real simple example, but what I'm seeing here is multiple joins and huge queries built from strings and then executed. Some of them even write out what's basically a function to execute, including Declare statements, variables, etc. Is there an advantage to doing it this way when you could do it with just conditions in the sql itself? To me, it seems a lot harder to debug, change and even write vs adding cases, if-elses or additional where parameters to branch the query.

Read the article
What Counts for A DBA - Logic

- by drsql

"There are 10 kinds of people in the world. Those who will always wonder why there are only two items in my list and those who will figured it out the first time they saw this very old joke." Those readers who will give up immediately and get frustrated with me for not explaining it to them are not likely going to be great technical professionals of any sort, much less a programmer or administrator who will be constantly dealing with the common failures that make up a DBA's day. Many of these people will stare at this like a dog staring at a traffic signal and still have no more idea of how to decipher the riddle. Without explanation they will give up, call the joke "stupid" and, feeling quite superior, walk away indignantly to their job likely flipping patties of meat-by-product. As a data professional or any programmer who has strayed to this very data-oriented blog, you would, if you are worth your weight in air, either have recognized immediately what was going on, or felt a bit ignorant. Your friends are chuckling over the joke, but why is it funny? Unfortunately you left your smartphone at home on the dresser because you were up late last night programming and were running late to work (again), so you will either have to fake a laugh or figure it out. Digging through the joke, you figure out that the word "two" is the most important part, since initially the joke mentioned 10. Hmm, why did they spell out two, but not ten? Maybe 10 could be interpreted a different way? As a DBA, this sort of logic comes into play every day, and sometimes it doesn't involve nerdy riddles or Star Wars folklore. When you turn on your computer and get the dreaded blue screen of death, you don't immediately cry to the help desk and sit on your thumbs and whine about not being able to work. Do that and your co-workers will question your nerd-hood; I know I certainly would. You figure out the problem, and when you have it narrowed down, you call the help desk and tell them what the problem is, usually having to explain that yes, you did in fact try to reboot before calling. Of course, sometimes humility does come in to play when you reach the end of your abilities, but the ‘end of abilities’ is not something any of us recognize readily. It is handy to have the ability to use logic to solve uncommon problems: It becomes especially useful when you are trying to solve a data-related problem such as a query performance issue, and the way that you approach things will tell your coworkers a great deal about your abilities. The novice is likely to immediately take the approach of trying to add more indexes or blaming the hardware. As you become more and more experienced, it becomes increasingly obvious that performance issues are a very complex topic. A query may be slow for a myriad of reasons, from concurrency issues, a poor query plan because of a parameter value (like parameter sniffing,) poor coding standards, or just because it is a complex query that is going to be slow sometimes. Some queries that you will deal with may have twenty joins and hundreds of search criteria, and it can take a lot of thought to determine what is going on. You can usually figure out the problem to almost any query by using basic knowledge of how joins and queries work, together with the help of such things as the query plan, profiler or monitoring tools. It is not unlikely that it can take a full day’s work to understand some queries, breaking them down into smaller queries to find a very tiny problem. Not every time will you actually find the problem, and it is part of the process to occasionally admit that the problem is random, and everything works fine now. Sometimes, it is necessary to realize that a problem is outside of your current knowledge, and admit temporary defeat: You can, at least, narrow down the source of the problem by looking logically at all of the possible solutions. By doing this, you can satisfy your curiosity and learn more about what the actual problem was. For example, in the joke, had you never been exposed to the concept of binary numbers, there is no way you could have known that binary - 10 = decimal - 2, but you could have logically come to the conclusion that 10 must not mean ten in the context of the joke, and at that point you are that much closer to getting the joke and at least won't feel so ignorant.

Read the article
Free eBook: 45 Database Performance Tips for Developers

As a developer, if you need to go into the database and write queries, design tables, or determine the configuration of your SQL Server Systems, these tips should help make sure you're not unnecessarily sacrificing database performance. This eBook has 45 easy tips to improve the performance of your indexes and T-SQL queries, and hunt down problems within ORM tools and database design. Save 45% on our top SQL Server database administration tools. Together they make up the SQL DBA Bundle, which supports your core tasks and helps your day run smoothly. Download a free trial now.

Read the article
Phantom activity on MySQL

- by LoveMeSomeCode

This is probably just my total lack of MySQL expertise, but is it typical to see lots of phantom activity on a MySQL instance via phpMyAdmin? I have a shared hosting plan through Lithium, and when I log in through the phpMyAdmin console and click on the 'Status' tab, it's showing crazy high numbers for queries. Within an hour of activating my account I had 1 million queries. At first I thought this was them setting things up, but the number is climbing constantly, averaging 170/second. I've got a support ticket in with Lithium, but I thought I'd ask here if this were a MySQL/shared host thing, because I had the same thing happen with a shared hosting plan through Joyent.

Read the article
How should I handle search engines auto-correcting the spelling of a site's name?

- by Nathan G.

A client's site and company is called 'Tranin Communications' (Tranin is her last name). It ranks well in searches for her name but rather poorly in searches for the name of her site/company. I realized that this is largely due to* search engines (Google especially) assuming that the query was misspelled and automatically including results for both 'train communications' and 'communications training'. Both of those queries yield many high-ranking sites that completely drown out hers. Sometimes Google even shows results for 'communications training' instead of 'tranin communications', hiding her site altogether. Is there a way to report an incorrect auto-correction to Google or something I can do to discourage this behavior (e.g. a meta tag)? My searches have come up cold, any suggestions would be appreciated. *I've come to this conclusion because her site ranks very highly when the same queries are put in quotes.

Read the article
Coherence Query Performance in Large Clusters

- by jpurdy

Large clusters (measured in terms of the number of storage-enabled members participating in the largest cache services) may introduce challenges when issuing queries. There is no particular cluster size threshold for this, rather a gradually increasing tendency for issues to arise. The most obvious challenges are that a client's perceived query latency will be determined by the slowest responder (more likely to be a factor in larger clusters) as well as the fact that adding additional cache servers will not increase query throughput if the query processing is not compute-bound (which would generally be the case for most indexed queries). If the data set can take advantage of the partition affinity features of Coherence, then the application can use a PartitionedFilter to target a query to a single server (using partition affinity to ensure that all data is in a single partition). If this can not be done, then avoiding an excessive number of cache server JVMs will help, as will ensuring that each cache server has sufficient CPU resources available and is also properly configured to minimize GC pauses (the most common cause of a slow-responding cache server).

Read the article

Search Results

Search found 7116 results on 285 pages for 'nested queries'.

Page 82/285 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89 | Next Page >

- by Kuntal Shah

- by John

- by DigiMortal

- by Ray Eatmon

- by Grant Fritchey

- by Joe Mayo

- by DrJohn

- by Mike Willliams

- by Anders

- by Rup

- by DavRob60

- by Bill Graziano

- by Adam Machanic

- by Davide Mauri

- by deostroll

- by Joe Mayo

- by hephestos

- by user433500

- by Yatrix

- by drsql

- by LoveMeSomeCode

- by Nathan G.

- by jpurdy

< Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89 | Next Page >