adhoc queries - Page 64

What method of MySQL mirroring should I use for this?

- by user45745

I'm running an web application hosting service (basically hosting forums for free), and I have two remote servers at my disposal. The code for the application is stored on both servers and isn't a problem, but I'm wondering how to deal with the databases. When someone goes onto a site *.example-host.com, they are sent to one of the two servers and both must be capable of loading the forums from a database. The database must also have write access, for when new members register or post topics etc. The main requirement is speed, but uptime is also important (if a server goes out, the site should still work). I have a few options, but I'm inexperienced and not sure which to go with: 1) [PHP] Split the forum records 50:50 between the two servers. If a server does not have the record for a forum requested, it can request it from the other by remote MySQL and load it. This idea sounded okay, until I realised that 50% of the time, users would be waiting significantly longer for pages to load. I also realised that if one of the servers went down, half the forums would be inaccessible and registrations would have to be disabled. 2) [MySQL] Dual master replication. This would attempt to mirror the two databases and sounds perfect, but I've heard that it can be very problematic. I don't know how fast this is. 3) [MySQL] Use a standard replication, distribute read only queries on both nodes and read/write queries to the master. This sounds like a good option, but again, I'm not sure on speed. I also don't know what would happen if the master server went down. If you have any other suggestions, please post them :)

Read the article

Managing records of bugs and notes

- by Jim

Hi. I want to create a knowledgebase for a piece of software. I'd also like to be able to track bugs and common points of failure in that application. Linking knowledgebase articles to bug records would be a real boon, as would the ability to do complex queries for particular articles and bugs on the basis of tags or metadata. I've never done anything like this before, and like to install as little as possible. I've been looking at creating a wiki with Wiki On A Stick, and it seems to offer a lot. But I can't make complex queries. I can create pages that list all 'articles' with a particular single tag, but I can't specify multiple tags or filters. Is there any software that can help? I don't want to spend money until I've tried something out thoroughly, and I'd ideally like something that demands little-to-no installation. Are there any tools that can help me? If something could easily export its data, or stored data in XML, that would be a real plus too. Otherwise, are there any simple apps that allow me to set up forms for bugs, store data as XML then query and process that XML on demand? Thanks in advance.

Read the article

Server down at 23:26 every night

- by miccet

We are having a big problem with our sites stability the last couple of weeks and after endless hours of troubleshooting I don't get anywhere. So I turn to you dear community. Setup: 2 x VPS servers - Front end, 8 core, 8G RAM. - Database, 5 core, 3G RAM. Both running Ubuntu. Ruby on Rails EE with Passenger 3 and Rails 2.3.11. MySQL 5.1.67. The problem is that each night, at the exact same time (23:26) the SQL server suddenly shows a processlist full of COMMIT with an increasing Time. After 30-40 seconds (can go longer) a wave seems processed and the site responds for a few seconds before it repeats. During this hick up the database server load spikes while the front end is relaxing. I have looked at slow queries, but is not finding any locks or other unusual queries ran at this time. I have looked at iotop at the time of the halt and there is no activity from mysql. I also tried turning off query_cache and messed around with the mysql configuration file without much change. Any ideas?

Read the article

Mysql: Working With 192 Trillion Records... (Yes, 192 Trillion)

- by Sarah

Here's the question... Considering 192 trillion records, what should my considerations be? My main concern is speed. Here's the table... CREATE TABLE `ref` ( `id` INTEGER(13) AUTO_INCREMENT DEFAULT NOT NULL, `rel_id` INTEGER(13) NOT NULL, `p1` INTEGER(13) NOT NULL, `p2` INTEGER(13) DEFAULT NULL, `p3` INTEGER(13) DEFAULT NULL, `s` INTEGER(13) NOT NULL, `p4` INTEGER(13) DEFAULT NULL, `p5` INTEGER(13) DEFAULT NULL, `p6` INTEGER(13) DEFAULT NULL, PRIMARY KEY (`id`), KEY (`s`), KEY (`rel_id`), KEY (`p3`), KEY (`p4`) ); Here's the queries... SELECT id, s FROM ref WHERE red_id="$rel_id" AND p3="$p3" AND p4="$p4" SELECT rel_id, p1, p2, p3, p4, p5, p6 FROM ref WHERE id="$id" INSERT INTO rel (rel_id, p1, p2, p3, s, p4, p5, p6) VALUES ("$rel_id", "$p1", "$p2", "$p3", "$s", "$p4", "$p5", "$p6") Here's some notes... The SELECT's will be done much more frequently than the INSERT. However, occasionally I want to add a few hundred records at a time. Load-wise, there will be nothing for hours then maybe a few thousand queries all at once. Don't think I can normalize any more (need the p values in a combination) The database as a whole is very relational. This will be the largest table by far (next largest is about 900k) UPDATE (08/11/2010) Interestingly, I've been given a second option... Instead of 192 trillion I could store 2.6*10^16 (15 zeros, meaning 26 Quadrillion)... But in this second option I would only need to store one bigint(18) as the index in a table. That's it - just the one column. So I would just be checking for the existence of a value. Occasionally adding records, never deleting them. So that makes me think there must be a better solution then mysql for simply storing numbers... Given this second option, should I take it or stick with the first... [edit] Just got news of some testing that's been done - 100 million rows with this setup returns the query in 0.0004 seconds [/edit]

Read the article

Software Architecture and Software Architecture Evaluation

How many of us have worked at places where the concept of software architecture was ridiculed for wasting time and money? Even more ridiculous to them was the concept of evaluating software architecture. I think the next time that I am in this situation again, and I hope that I never am I will have to push for this methodology in the software development life cycle. I have spent way too many hours/days/months/years working poorly architected systems or systems that were just built ADHOC. This in software development must stop. I can understand why systems get like this due to overzealous sales staff, demanding management that wants everything yesterday, and project managers asking if things are done yet before the project has even started. But seriously, some time must be spent designing the applications that we write along with evaluating the architecture so that it will integrate will within the existing systems of an origination. If placed in this situation again, I will strive to gain buying from key players within the business, for example: Senior Software Engineers\Developers, Software Architects, Project Managers, Software Quality Assurance, Technical Services, Operations, and Finance in order for this idea to succeed with upper management. In order to convince these key players I will have to show them the benefits of architecture and even more benefits of evaluating software architecture on a system wide level. Benefits of Software Architecture Evaluation Places Stakeholders in the Same Room to Communicate Ensures Delivery of Detailed Quality Goals Prioritizes Conflicting Goals Requires Clear Explication Improves the Quality of Documentation Discovers Opportunities for Cross-Project Reuse Improves Architecture Practices Once I had key player buy in then and only then would I approach upper management about my plan regarding implementing the concept of software architecture and using evaluation to ensure that the software being designed is the proper architecture for the project. In addition to the benefits listed above I would also show upper management how much time is being wasted by not doing these evaluations. For example, if project X cost us Y amount, then why do we have several implementations in various forms of X and how much money and time could we have saved if we just reused the existing code base to give each system the same functionality that was already created? After this, I would mention what would happen if we had 50 instances of this situation? Then I would show them how the software architecture evaluation process would have prevented this and that the optimization could have leveraged its existing code base to increase the speed and quality of its development. References:Carnegie Mellon Software Engineering Institute (2011). Architecture Tradeoff Analysis Method from http://www.sei.cmu.edu/architecture/tools/evaluate/atam.cfm

Read the article

What causes Multi-Page allocations?

- by SQLOS Team

Writing about changes in the Denali Memory Manager In his last post Rusi mentioned: " In previous SQL versions only the 8k allocations were limited by the ‘max server memory’ configuration option. Allocations larger than 8k weren’t constrained." In SQL Server versions before Denali single page allocations and multi-Page allocations are handled by different components, the Single Page Allocator (which is responsible for Buffer Pool allocations and governed by 'max server memory') and the Multi-Page allocator (MPA) which handles allocations of greater than an 8K page. If there are many multi-page allocations this can affect how much memory needs to be reserved outside 'max server memory' which may in turn involve setting the -g memory_to_reserve startup parameter. We'll follow up with more generic articles on the new Memory Manager structure, but in this post I want to clarify what might cause these larger allocations. So what kinds of query result in MPA activity? I was asked this question the other day after delivering an MCM webcast on Memory Manager changes in Denali. After asking around our Dev team I was connected to one of our test leads Sangeetha who had tested the plan cache, and kindly provided this example of an MPA intensive query: A workload that has stored procedures with a large # of parameters (say > 100, > 500), and then invoked via large ad hoc batches, where each SP has different parameters will result in a plan being cached for this “exec proc” batch. This plan will result in MPA. Exec proc_name @p1, ….@p500 Exec proc_name @p1, ….@p500 . . . Exec proc_name @p1, ….@p500 Go Another workload would be large adhoc batches of the form: Select * from t where col1 in (1, 2, 3, ….500) Select * from t where col1 in (1, 2, 3, ….500) Select * from t where col1 in (1, 2, 3, ….500) … Go In Denali all page allocations are handled by an "Any size page allocator" and included in 'max server memory'. The buffer pool effectively becomes a client of the any size page allocator, which in turn relies on the memory manager. - Guy Originally posted at http://blogs.msdn.com/b/sqlosteam/

Read the article

How do I know if I am using Scrum methodologies?

- by Jake

When I first started at my current job, my purpose was to rewrite a massive excel-VBA workbook-application to C# Winforms because it was thought that the new C# app will fix all existing problems and have all the new features for a perfect world. If it were a direct port, in theory it would be easy as i just need to go through all the formulas, conditional formatting, validations, VBA etc. to understand it. However, that was not the case. Many of the new features are tightly dependant on business logic which I am unfamiliar with. As a solo programmer, the first year was spent solely on deciphering the excel workbook and writing the C# app. In theory, I had the business people to "help" me specify requirements, how GUI looks and work, and testing of the app etc; but in practice it is like a contant tsunami of feature creep. At the beginning of the second year I managed to convince the management that this is not going anywhere. I made them start from scratch with the excel-VBA. I have this "issue log" saved on the network, each time they found something they didn't like about the excel-VBA app, they will write it in there. I check the log daily and consolidate issues (in my mind) mainly into 2 groups: (1) requires massive change. (2) can be fixed in current version. For massive change issues, I make a copy of the latest excel-VBA and give it a new version number, then work on it whenever I can. For current version fixes, I make the changes in a few days to a week, and then immediately release it. I also ensure I update the same change in any in-progress massive change future versions. This has gone on for about 4 months and I feel it works great. I made many releases and solved many real issues, also understood the business logic more and more. However, my boss (non-IT trained) thinks what I am doing are just adhoc changes and that i am not looking at the "bigger picture". I am struggling to convince my boss that this works. So I hope to formalise my approach and maybe borrow a buzzword to confuse him. Incidentally, I read about Agile and SCRUM, about backlog and sprints. But it's all very vague to me still. QUESTION (finally): I want to tell him that this is SCRUM! But I want to hold my breath first and ask whether my current approach is considered SCRUM or SCRUM-like? How can I make it more SCRUM-like? Note that I have only myself, there's no project leader or teams.

Read the article

Best practices for using the Entity Framework with WPF DataBinding

- by Ken Smith

I'm in the process of building my first real WPF application (i.e., the first intended to be used by someone besides me), and I'm still wrapping my head around the best way to do things in WPF. It's a fairly simple data access application using the still-fairly-new Entity Framework, but I haven't been able to find a lot of guidance online for the best way to use these two technologies (WPF and EF) together. So I thought I'd toss out how I'm approaching it, and see if anyone has any better suggestions. I'm using the Entity Framework with SQL Server 2008. The EF strikes me as both much more complicated than it needs to be, and not yet mature, but Linq-to-SQL is apparently dead, so I might as well use the technology that MS seems to be focusing on. This is a simple application, so I haven't (yet) seen fit to build a separate data layer around it. When I want to get at data, I use fairly simple Linq-to-Entity queries, usually straight from my code-behind, e.g.: var families = from family in entities.Family.Include("Person") orderby family.PrimaryLastName, family.Tag select family; Linq-to-Entity queries return an IOrderedQueryable result, which doesn't automatically reflect changes in the underlying data, e.g., if I add a new record via code to the entity data model, the existence of this new record is not automatically reflected in the various controls referencing the Linq query. Consequently, I'm throwing the results of these queries into an ObservableCollection, to capture underlying data changes: familyOC = new ObservableCollection<Family>(families.ToList()); I then map the ObservableCollection to a CollectionViewSource, so that I can get filtering, sorting, etc., without having to return to the database. familyCVS.Source = familyOC; familyCVS.View.Filter = new Predicate<object>(ApplyFamilyFilter); familyCVS.View.SortDescriptions.Add(new System.ComponentModel.SortDescription("PrimaryLastName", System.ComponentModel.ListSortDirection.Ascending)); familyCVS.View.SortDescriptions.Add(new System.ComponentModel.SortDescription("Tag", System.ComponentModel.ListSortDirection.Ascending)); I then bind the various controls and what-not to that CollectionViewSource: <ListBox DockPanel.Dock="Bottom" Margin="5,5,5,5" Name="familyList" ItemsSource="{Binding Source={StaticResource familyCVS}, Path=., Mode=TwoWay}" IsSynchronizedWithCurrentItem="True" ItemTemplate="{StaticResource familyTemplate}" SelectionChanged="familyList_SelectionChanged" /> When I need to add or delete records/objects, I manually do so from both the entity data model, and the ObservableCollection: private void DeletePerson(Person person) { entities.DeleteObject(person); entities.SaveChanges(); personOC.Remove(person); } I'm generally using StackPanel and DockPanel controls to position elements. Sometimes I'll use a Grid, but it seems hard to maintain: if you want to add a new row to the top of your grid, you have to touch every control directly hosted by the grid to tell it to use a new line. Uggh. (Microsoft has never really seemed to get the DRY concept.) I almost never use the VS WPF designer to add, modify or position controls. The WPF designer that comes with VS is sort of vaguely helpful to see what your form is going to look like, but even then, well, not really, especially if you're using data templates that aren't binding to data that's available at design time. If I need to edit my XAML, I take it like a man and do it manually. Most of my real code is in C# rather than XAML. As I've mentioned elsewhere, entirely aside from the fact that I'm not yet used to "thinking" in it, XAML strikes me as a clunky, ugly language, that also happens to come with poor designer and intellisense support, and that can't be debugged. Uggh. Consequently, whenever I can see clearly how to do something in C# code-behind that I can't easily see how to do in XAML, I do it in C#, with no apologies. There's been plenty written about how it's a good practice to almost never use code-behind in WPF page (say, for event-handling), but so far at least, that makes no sense to me whatsoever. Why should I do something in an ugly, clunky language with god-awful syntax, an astonishingly bad editor, and virtually no type safety, when I can use a nice, clean language like C# that has a world-class editor, near-perfect intellisense, and unparalleled type safety? So that's where I'm at. Any suggestions? Am I missing any big parts of this? Anything that I should really think about doing differently?

Read the article

Lots of mysql Sleep processes

- by user259284

Hello, I am still having trouble with my mysql server. It seems that since i optimize it, the tables were growing and now sometimes is very slow again. I have no idea of how to optimize more. mySQL server has 48GB of RAM and mysqld is using about 8, most of the tables are innoDB. Site has about 2000 users online. I also run explain on every query and every one of them is indexed. mySQL processes: http://www.pik.ba/mysqlStanje.php my.cnf: # The MySQL database server configuration file. # # You can copy this to one of: # - "/etc/mysql/my.cnf" to set global options, # - "~/.my.cnf" to set user-specific options. # # One can use all long options that the program supports. # Run program with --help to get a list of available options and with # --print-defaults to see which it would actually understand and use. # # For explanations see # http://dev.mysql.com/doc/mysql/en/server-system-variables.html # This will be passed to all mysql clients # It has been reported that passwords should be enclosed with ticks/quotes # escpecially if they contain "#" chars... # Remember to edit /etc/mysql/debian.cnf when changing the socket location. [client] port = 3306 socket = /var/run/mysqld/mysqld.sock # Here is entries for some specific programs # The following values assume you have at least 32M ram # This was formally known as [safe_mysqld]. Both versions are currently parsed. [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] # # * Basic Settings # user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp language = /usr/share/mysql/english skip-external-locking # # Instead of skip-networking the default is now to listen only on # localhost which is more compatible and is not less secure. bind-address = 10.100.27.30 # # * Fine Tuning # key_buffer = 64M key_buffer_size = 512M max_allowed_packet = 16M thread_stack = 128K thread_cache_size = 8 # This replaces the startup script and checks MyISAM tables if needed # the first time they are touched myisam-recover = BACKUP max_connections = 1000 table_cache = 1000 join_buffer_size = 2M tmp_table_size = 2G max_heap_table_size = 2G innodb_buffer_pool_size = 3G innodb_additional_mem_pool_size = 128M innodb_log_file_size = 100M log-slow-queries = /var/log/mysql/slow.log sort_buffer_size = 5M net_buffer_length = 5M read_buffer_size = 2M read_rnd_buffer_size = 12M thread_concurrency = 10 ft_min_word_len = 3 #thread_concurrency = 10 # # * Query Cache Configuration # query_cache_limit = 1M query_cache_size = 512M # # * Logging and Replication # # Both location gets rotated by the cronjob. # Be aware that this log type is a performance killer. #log = /var/log/mysql/mysql.log # # Error logging goes to syslog. This is a Debian improvement :) # # Here you can see queries with especially long duration #log_slow_queries = /var/log/mysql/mysql-slow.log #long_query_time = 2 #log-queries-not-using-indexes # # The following can be used as easy to replay backup logs or for replication. # note: if you are setting up a replication slave, see README.Debian about # other settings you may need to change. #server-id = 1 #log_bin = /var/log/mysql/mysql-bin.log expire_logs_days = 10 max_binlog_size = 100M #binlog_do_db = include_database_name #binlog_ignore_db = include_database_name # # * BerkeleyDB # # Using BerkeleyDB is now discouraged as its support will cease in 5.1.12. skip-bdb # # * InnoDB # # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/. # Read the manual for more InnoDB related options. There are many! # You might want to disable InnoDB to shrink the mysqld process by circa 100MB. #skip-innodb # # * Security Features # # Read the manual, too, if you want chroot! # chroot = /var/lib/mysql/ # # For generating SSL certificates I recommend the OpenSSL GUI "tinyca". # # ssl-ca=/etc/mysql/cacert.pem # ssl-cert=/etc/mysql/server-cert.pem # ssl-key=/etc/mysql/server-key.pem [mysqldump] quick quote-names max_allowed_packet = 16M [mysql] #no-auto-rehash # faster start of mysql but no tab completition [isamchk] key_buffer = 16M # # * NDB Cluster # # See /usr/share/doc/mysql-server-*/README.Debian for more information. # # The following configuration is read by the NDB Data Nodes (ndbd processes) # not from the NDB Management Nodes (ndb_mgmd processes). # # [MYSQL_CLUSTER] # ndb-connectstring=127.0.0.1 # # * IMPORTANT: Additional settings that can override those from this file! # The files must end with '.cnf', otherwise they'll be ignored. # !includedir /etc/mysql/conf.d/

Read the article

Server Memory with Magento

- by Mohamed Elgharabawy

I have a cloud server with the following specifications: 2vCPUs 4G RAM 160GB Disk Space Network 400Mb/s System Image: Ubuntu 12.04 LTS I am only running Magento CE 1.7.0.2 on this server. Nothing else. Usually, the server has a loading time of 4-5 seconds. Recently, this has dropped to over 30 seconds and sometimes the server just goes away and I get HTTP error reports to my email stating that HTTP requests took more than 20000ms. Running top command and sorting them returns the following: top - 15:29:07 up 3:40, 1 user, load average: 28.59, 25.95, 22.91 Tasks: 112 total, 30 running, 82 sleeping, 0 stopped, 0 zombie Cpu(s): 90.2%us, 9.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.3%si, 0.2%st PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 31901 www-data 20 0 360m 71m 5840 R 7 1.8 1:39.51 apache2 32084 www-data 20 0 362m 72m 5548 R 7 1.8 1:31.56 apache2 32089 www-data 20 0 348m 59m 5660 R 7 1.5 1:41.74 apache2 32295 www-data 20 0 343m 54m 5532 R 7 1.4 2:00.78 apache2 32303 www-data 20 0 354m 65m 5260 R 7 1.6 1:38.76 apache2 32304 www-data 20 0 346m 56m 5544 R 7 1.4 1:41.26 apache2 32305 www-data 20 0 348m 59m 5640 R 7 1.5 1:50.11 apache2 32291 www-data 20 0 358m 69m 5256 R 6 1.7 1:44.26 apache2 32517 www-data 20 0 345m 56m 5532 R 6 1.4 1:45.56 apache2 30473 www-data 20 0 355m 66m 5680 R 6 1.7 2:00.05 apache2 32093 www-data 20 0 352m 63m 5848 R 6 1.6 1:53.23 apache2 32302 www-data 20 0 345m 56m 5512 R 6 1.4 1:55.87 apache2 32433 www-data 20 0 346m 57m 5500 S 6 1.4 1:31.58 apache2 32638 www-data 20 0 354m 65m 5508 R 6 1.6 1:36.59 apache2 32230 www-data 20 0 347m 57m 5524 R 6 1.4 1:33.96 apache2 32231 www-data 20 0 355m 66m 5512 R 6 1.7 1:37.47 apache2 32233 www-data 20 0 354m 64m 6032 R 6 1.6 1:59.74 apache2 32300 www-data 20 0 355m 66m 5672 R 6 1.7 1:43.76 apache2 32510 www-data 20 0 347m 58m 5512 R 6 1.5 1:42.54 apache2 32521 www-data 20 0 348m 59m 5508 R 6 1.5 1:47.99 apache2 32639 www-data 20 0 344m 55m 5512 R 6 1.4 1:34.25 apache2 32083 www-data 20 0 345m 56m 5696 R 5 1.4 1:59.42 apache2 32085 www-data 20 0 347m 58m 5692 R 5 1.5 1:42.29 apache2 32293 www-data 20 0 353m 64m 5676 R 5 1.6 1:52.73 apache2 32301 www-data 20 0 348m 59m 5564 R 5 1.5 1:49.63 apache2 32528 www-data 20 0 351m 62m 5520 R 5 1.6 1:36.11 apache2 31523 mysql 20 0 3460m 576m 8288 S 5 14.4 2:06.91 mysqld 32002 www-data 20 0 345m 55m 5512 R 5 1.4 2:01.88 apache2 32080 www-data 20 0 357m 68m 5512 S 5 1.7 1:31.30 apache2 32163 www-data 20 0 347m 58m 5512 S 5 1.5 1:58.68 apache2 32509 www-data 20 0 345m 56m 5504 R 5 1.4 1:49.54 apache2 32306 www-data 20 0 358m 68m 5504 S 4 1.7 1:53.29 apache2 32165 www-data 20 0 344m 55m 5524 S 4 1.4 1:40.71 apache2 32640 www-data 20 0 345m 56m 5528 R 4 1.4 1:36.49 apache2 31888 www-data 20 0 359m 70m 5664 R 4 1.8 1:57.07 apache2 32511 www-data 20 0 357m 67m 5512 S 3 1.7 1:47.00 apache2 32054 www-data 20 0 357m 68m 5660 S 2 1.7 1:53.10 apache2 1 root 20 0 24452 2276 1232 S 0 0.1 0:01.58 init Moreover, running free -m returns the following: total used free shared buffers cached Mem: 4003 3919 83 0 118 901 -/+ buffers/cache: 2899 1103 Swap: 0 0 0 To investigate this further, I have installed apache buddy, it recommeneded that I need to reduce the maxclient connections. Which I did. I also installed MysqlTuner and it suggests that I need to set my innodb_buffer_pool_size to = 3.0G. However, I cannot do that, since the whole memory is 4G. Here is the output from apache buddy: ### GENERAL REPORT ### Settings considered for this report: Your server's physical RAM: 4003MB Apache's MaxClients directive: 40 Apache MPM Model: prefork Largest Apache process (by memory): 73.77MB [ OK ] Your MaxClients setting is within an acceptable range. Max potential memory usage: 2950.8 MB Percentage of RAM allocated to Apache 73.72 % And this is the output of MySQLTuner: -------- Performance Metrics ------------------------------------------------- [--] Up for: 47m 22s (675K q [237.552 qps], 12K conn, TX: 1B, RX: 300M) [--] Reads / Writes: 45% / 55% [--] Total buffers: 2.1G global + 2.7M per thread (151 max threads) [OK] Maximum possible memory usage: 2.5G (64% of installed RAM) [OK] Slow queries: 0% (0/675K) [OK] Highest usage of available connections: 26% (40/151) [OK] Key buffer size / total MyISAM indexes: 36.0M/18.7M [OK] Key buffer hit rate: 100.0% (245K cached / 105 reads) [OK] Query cache efficiency: 92.5% (500K cached / 541K selects) [!!] Query cache prunes per day: 302886 [OK] Sorts requiring temporary tables: 0% (1 temp sorts / 15K sorts) [!!] Joins performed without indexes: 12135 [OK] Temporary tables created on disk: 25% (8K on disk / 32K total) [OK] Thread cache hit rate: 90% (1K created / 12K connections) [!!] Table cache hit rate: 17% (400 open / 2K opened) [OK] Open file limit used: 12% (123/1K) [OK] Table locks acquired immediately: 100% (196K immediate / 196K locks) [!!] InnoDB buffer pool / data size: 2.0G/3.5G [OK] InnoDB log waits: 0 -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Read this before increasing table_cache over 64: http://bit.ly/1mi7c4C Variables to adjust: query_cache_size ( 64M) join_buffer_size ( 128.0K, or always use indexes with joins) table_cache ( 400) innodb_buffer_pool_size (= 3G) Last but not least, the server still has more than 60% of free disk space. Now, based on the above, I have few questions: Are these numbers normal? Do they make sense? Do I need to upgrade the server? If I don't need to upgrade and my configuration is not correct, how do I optimize it?

Read the article

Query doesn't use a covering-index when applicable

- by Dor

I've downloaded the employees database and executed some queries for benchmarking purposes. Then I noticed that one query didn't use a covering index, although there was a corresponding index that I created earlier. Only when I added a FORCE INDEX clause to the query, it used a covering index. I've uploaded two files, one is the executed SQL queries and the other is the results. Can you tell why the query uses a covering-index only when a FORCE INDEX clause is added? The EXPLAIN shows that in both cases, the index dept_no_from_date_idx is being used anyway. To adapt myself to the standards of SO, I'm also writing the content of the two files here: The SQL queries: USE employees; /* Creating an index for an index-covered query */ CREATE INDEX dept_no_from_date_idx ON dept_emp (dept_no, from_date); /* Show `dept_emp` table structure, indexes and generic data */ SHOW TABLE STATUS LIKE "dept_emp"; DESCRIBE dept_emp; SHOW KEYS IN dept_emp; /* The EXPLAIN shows that the subquery doesn't use a covering-index */ EXPLAIN SELECT SQL_NO_CACHE * FROM dept_emp INNER JOIN ( /* The subquery should use a covering index, but isn't */ SELECT SQL_NO_CACHE emp_no, dept_no FROM dept_emp WHERE dept_no="d001" ORDER BY from_date DESC LIMIT 20000,50 ) AS `der` USING (`emp_no`, `dept_no`); /* The EXPLAIN shows that the subquery DOES use a covering-index, thanks to the FORCE INDEX clause */ EXPLAIN SELECT SQL_NO_CACHE * FROM dept_emp INNER JOIN ( /* The subquery use a covering index */ SELECT SQL_NO_CACHE emp_no, dept_no FROM dept_emp FORCE INDEX(dept_no_from_date_idx) WHERE dept_no="d001" ORDER BY from_date DESC LIMIT 20000,50 ) AS `der` USING (`emp_no`, `dept_no`); The results: -------------- /* Creating an index for an index-covered query */ CREATE INDEX dept_no_from_date_idx ON dept_emp (dept_no, from_date) -------------- Query OK, 331603 rows affected (33.95 sec) Records: 331603 Duplicates: 0 Warnings: 0 -------------- /* Show `dept_emp` table structure, indexes and generic data */ SHOW TABLE STATUS LIKE "dept_emp" -------------- +----------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ | Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment | +----------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ | dept_emp | InnoDB | 10 | Compact | 331883 | 36 | 12075008 | 0 | 21544960 | 29360128 | NULL | 2010-05-04 13:07:49 | NULL | NULL | utf8_general_ci | NULL | | | +----------+--------+---------+------------+--------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+ 1 row in set (0.47 sec) -------------- DESCRIBE dept_emp -------------- +-----------+---------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------+---------+------+-----+---------+-------+ | emp_no | int(11) | NO | PRI | NULL | | | dept_no | char(4) | NO | PRI | NULL | | | from_date | date | NO | | NULL | | | to_date | date | NO | | NULL | | +-----------+---------+------+-----+---------+-------+ 4 rows in set (0.05 sec) -------------- SHOW KEYS IN dept_emp -------------- +----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ | dept_emp | 0 | PRIMARY | 1 | emp_no | A | 331883 | NULL | NULL | | BTREE | | | dept_emp | 0 | PRIMARY | 2 | dept_no | A | 331883 | NULL | NULL | | BTREE | | | dept_emp | 1 | emp_no | 1 | emp_no | A | 331883 | NULL | NULL | | BTREE | | | dept_emp | 1 | dept_no | 1 | dept_no | A | 7 | NULL | NULL | | BTREE | | | dept_emp | 1 | dept_no_from_date_idx | 1 | dept_no | A | 13 | NULL | NULL | | BTREE | | | dept_emp | 1 | dept_no_from_date_idx | 2 | from_date | A | 165941 | NULL | NULL | | BTREE | | +----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 6 rows in set (0.23 sec) -------------- /* The EXPLAIN shows that the subquery doesn't use a covering-index */ EXPLAIN SELECT SQL_NO_CACHE * FROM dept_emp INNER JOIN ( /* The subquery should use a covering index, but isn't */ SELECT SQL_NO_CACHE emp_no, dept_no FROM dept_emp WHERE dept_no="d001" ORDER BY from_date DESC LIMIT 20000,50 ) AS `der` USING (`emp_no`, `dept_no`) -------------- +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+-------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | | | 1 | PRIMARY | dept_emp | eq_ref | PRIMARY,emp_no,dept_no,dept_no_from_date_idx | PRIMARY | 16 | der.emp_no,der.dept_no | 1 | | | 2 | DERIVED | dept_emp | ref | dept_no,dept_no_from_date_idx | dept_no_from_date_idx | 12 | | 21402 | Using where | +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+-------------+ 3 rows in set (0.09 sec) -------------- /* The EXPLAIN shows that the subquery DOES use a covering-index, thanks to the FORCE INDEX clause */ EXPLAIN SELECT SQL_NO_CACHE * FROM dept_emp INNER JOIN ( /* The subquery use a covering index */ SELECT SQL_NO_CACHE emp_no, dept_no FROM dept_emp FORCE INDEX(dept_no_from_date_idx) WHERE dept_no="d001" ORDER BY from_date DESC LIMIT 20000,50 ) AS `der` USING (`emp_no`, `dept_no`) -------------- +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+--------------------------+ | 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 50 | | | 1 | PRIMARY | dept_emp | eq_ref | PRIMARY,emp_no,dept_no,dept_no_from_date_idx | PRIMARY | 16 | der.emp_no,der.dept_no | 1 | | | 2 | DERIVED | dept_emp | ref | dept_no_from_date_idx | dept_no_from_date_idx | 12 | | 37468 | Using where; Using index | +----+-------------+------------+--------+----------------------------------------------+-----------------------+---------+------------------------+-------+--------------------------+ 3 rows in set (0.05 sec) Bye

Read the article

Openfire and LDAP issues

- by clsmith

Thanks in advance for the help. Has anyone see this issue with openfire? Currently I use Openfire Fedora with Auth using windows 2003 and also use mysql for the database. When I bring up two clients and talk to each other the time is slow between messages. Sometimes it can take between 5-15 minutes for something sent to get to the person (this is with only two people on the openfire server). I ran a tcp dump using port 389 and see that the machine is running thousands of queries against ldap. When i plug it into wireshark I notice that it is transferring the entire contact list or checking on the status of the entire contact list ? When I run debug on openfire itself I am presented with only this small message in the log: 2010.06.08 07:01:17 LdapManager: Starting LDAP search... 2010.06.08 07:01:17 LdapManager: ... search finished 2010.06.08 07:01:17 LdapManager: Creating a DirContext in LdapManager.getContext()... 2010.06.08 07:01:17 LdapManager: Created hashtable with context values, attempting to create context... 2010.06.08 07:01:17 LdapManager: ... context created successfully, returning. 2010.06.08 07:01:17 LdapManager: Trying to find a groups's DN based on it's groupname. cn: Spark agents CLT, Base DN: OU="Hidden",DC="Hidden",DC="net"... 2010.06.08 07:01:17 LdapManager: Creating a DirContext in LdapManager.getContext()... 2010.06.08 07:01:17 LdapManager: Created hashtable with context values, attempting to create context... 2010.06.08 07:01:17 LdapManager: ... context created successfully, returning. 2010.06.08 07:01:17 LdapManager: Starting LDAP search... 2010.06.08 07:01:17 LdapManager: ... search finished 2010.06.08 07:01:17 LdapManager: Trying to find a groups's DN based on it's groupname. cn: Spark agents CLT, Base DN: OU="Hidden",DC="Hidden",DC="net"... 2010.06.08 07:01:17 LdapManager: Creating a DirContext in LdapManager.getContext()... 2010.06.08 07:01:17 LdapManager: Created hashtable with context values, attempting to create context... 2010.06.08 07:01:17 LdapManager: ... context created successfully, returning. 2010.06.08 07:01:17 LdapManager: Starting LDAP search... 2010.06.08 07:01:17 LdapManager: ... search finished I thought this was a configuration on my end and started to look into the cache settings on the openfire webpages. I tweaked the settings as recommend by the pages and still get the same issues. I doesnt seem to cache the contact list or this might be a feature never fixed or implemented. Has anyone gone through this before ? I have searched online and I see others have great experience with openfire with no issues like I have, or is it because noone checked the queries ? For the time being I created a new Domain Controller and moved openfire to that computer so it can run local queries. This seems to help reduce the speed alot, but when I run the server performance manager tool I see that with two people only using that openfire server I run 593.7 request per second. Thanks for your help, if I didnt provide enough data please let me know what you need and I can find it.

Read the article

Log a user in to an ASP.net application using Windows Authentication without using Windows Authentic

- by Rising Star

I have an ASP.net application I'm developing authentication for. I am using an existing cookie-based log on system to log users in to the system. The application runs as an anonymous account and then checks the cookie when the user wants to do something restricted. This is working fine. However, there is one caveat: I've been told that for each page that connects to our SQL server, I need to make it so that the user connects using an Active Directory account. because the system I'm using is cookie based, the user isn't logged in to Active Directory. Therefore, I use impersonation to connect to the server as a specific account. However, the powers that be here don't like impersonation; they say that it clutters up the code. I agree, but I've found no way around this. It seems that the only way that a user can be logged in to an ASP.net application is by either connecting with Internet Explorer from a machine where the user is logged in with their Active Directory account or by typing an Active Directory username and password. Neither of these two are workable in my application. I think it would be nice if I could make it so that when a user logs in and receives the cookie (which actually comes from a separate log on application, by the way), there could be some code run which tells the application to perform all network operations as the user's Active Directory account, just as if they had typed an Active Directory username and password. It seems like this ought to be possible somehow, but the solution evades me. How can I make this work? Update To those who have responded so far, I apologize for the confusion I have caused. The responses I've received indicate that you've misunderstood the question, so please allow me to clarify. I have no control over the requirement that users must perform network operations (such as SQL queries) using Active Directory accounts. I've been told several times (online and in meat-space) that this is an unusual requirement and possibly bad practice. I also have no control over the requirement that users must log in using the existing cookie-based log on application. I understand that in an ideal MS ecosystem, I would simply dis-allow anonymous access in my IIS settings and users would log in using Windows Authentication. This is not the case. The current system is that as far as IIS is concerned, the user logs in anonymously (even though they supply credentials which result in the issuance of a cookie) and we must programmatically check the cookie to see if the user has access to any restricted resources. In times past, we have simply used a single SQL account to perform all queries. My direct supervisor (who has many years of experience with this sort of thing) wants to change this. He says that if each user has his own AD account to perform SQL queries, it gives us more of a trail to follow if someone tries to do something wrong. The closest thing I've managed to come up with is using WIF to give the user a claim to a specific Active Directory account, but I still have to use impersonation because even still, the ASP.net process presents anonymous credentials to the SQL server. It boils down to this: Can I log users in with Active Directory accounts in my ASP.net application without having the users manually enter their AD credentials? (Windows Authentication)

Read the article

What are good CLI tools for JSON?

- by jasonmp85

General Problem Though I may be diagnosing the root cause of an event, determining how many users it affected, or distilling timing logs in order to assess the performance and throughput impact of a recent code change, my tools stay the same: grep, awk, sed, tr, uniq, sort, zcat, tail, head, join, and split. To glue them all together, Unix gives us pipes, and for fancier filtering we have xargs. If these fail me, there's always perl -e. These tools are perfect for processing CSV files, tab-delimited files, log files with a predictable line format, or files with comma-separated key-value pairs. In other words, files where each line has next to no context. XML Analogues I recently needed to trawl through Gigabytes of XML to build a histogram of usage by user. This was easy enough with the tools I had, but for more complicated queries the normal approaches break down. Say I have files with items like this: <foo user="me"> <baz key="zoidberg" value="squid" /> <baz key="leela" value="cyclops" /> <baz key="fry" value="rube" /> </foo> And let's say I want to produce a mapping from user to average number of <baz>s per <foo>. Processing line-by-line is no longer an option: I need to know which user's <foo> I'm currently inspecting so I know whose average to update. Any sort of Unix one liner that accomplishes this task is likely to be inscrutable. Fortunately in XML-land, we have wonderful technologies like XPath, XQuery, and XSLT to help us. Previously, I had gotten accustomed to using the wonderful XML::XPath Perl module to accomplish queries like the one above, but after finding a TextMate Plugin that could run an XPath expression against my current window, I stopped writing one-off Perl scripts to query XML. And I just found out about XMLStarlet which is installing as I type this and which I look forward to using in the future. JSON Solutions? So this leads me to my question: are there any tools like this for JSON? It's only a matter of time before some investigation task requires me to do similar queries on JSON files, and without tools like XPath and XSLT, such a task will be a lot harder. If I had a bunch of JSON that looked like this: { "firstName": "Bender", "lastName": "Robot", "age": 200, "address": { "streetAddress": "123", "city": "New York", "state": "NY", "postalCode": "1729" }, "phoneNumber": [ { "type": "home", "number": "666 555-1234" }, { "type": "fax", "number": "666 555-4567" } ] } And wanted to find the average number of phone numbers each person had, I could do something like this with XPath: fn:avg(/fn:count(phoneNumber)) Questions Are there any command-line tools that can "query" JSON files in this way? If you have to process a bunch of JSON files on a Unix command line, what tools do you use? Heck, is there even work being done to make a query language like this for JSON? If you do use tools like this in your day-to-day work, what do you like/dislike about them? Are there any gotchas? I'm noticing more and more data serialization is being done using JSON, so processing tools like this will be crucial when analyzing large data dumps in the future. Language libraries for JSON are very strong and it's easy enough to write scripts to do this sort of processing, but to really let people play around with the data shell tools are needed. Related Questions Grep and Sed Equivalent for XML Command Line Processing Is there a query language for JSON? JSONPath or other XPath like utility for JSON/Javascript; or Jquery JSON

Read the article

Javascript and Twitter API rate limitation? (Changing variable values in a loop)

- by Pablo

Hello, I have adapted an script from an example of http://github.com/remy/twitterlib. It´s a script that makes one query each 10 seconds to my Twitter timeline, to get only the messages that begin with a musical notation. It´s already working, but I don´t know it is the better way to do this... The Twitter API has a rate limit of 150 IP access per hour (queries from the same user). At this time, my Twitter API is blocked at 25 minutes because the 10 seconds frecuency between posts. If I set up a frecuency of 25 seconds between post, I am below the rate limit per hour, but the first 10 posts are shown so slowly. I think this way I can guarantee to be below the Twitter API rate limit and show the first 10 posts at normal speed: For the first 10 posts, I would like to set a frecuency of 5 seconds between queries. For the rest of the posts, I would like to set a frecuency of 25 seconds between queries. I think if making somewhere in the code a loop with the previous sentences, setting the "frecuency" value from 5000 to 25000 after the 10th query (or after 50 seconds, it´s the same), that´s it... Can you help me on modify this code below to make it work? Thank you in advance. var Queue = function (delay, callback) { var q = [], timer = null, processed = {}, empty = null, ignoreRT = twitterlib.filter.format('-"RT @"'); function process() { var item = null; if (q.length) { callback(q.shift()); } else { this.stop(); setTimeout(empty, 5000); } return this; } return { push: function (item) { var green = [], i; if (!(item instanceof Array)) { item = [item]; } if (timer == null && q.length == 0) { this.start(); } for (i = 0; i < item.length; i++) { if (!processed[item[i].id] && twitterlib.filter.match(item[i], ignoreRT)) { processed[item[i].id] = true; q.push(item[i]); } } q = q.sort(function (a, b) { return a.id > b.id; }); return this; }, start: function () { if (timer == null) { timer = setInterval(process, delay); } return this; }, stop: function () { clearInterval(timer); timer = null; return this; }, empty: function (fn) { empty = fn; return this; }, q: q, next: process }; }; $.extend($.expr[':'], { below: function (a, i, m) { var y = m[3]; return $(a).offset().top y; } }); function renderTweet(data) { var html = ''; html += ''; html += twitterlib.ify.clean(data.text); html += ''; since_id = data.id; return html; } function passToQueue(data) { if (data.length) { twitterQueue.push(data.reverse()); } } var frecuency = 10000; // The lapse between each new Queue var since_id = 1; var run = function () { twitterlib .timeline('twitteruser', { filter : "'?'", limit: 10 }, passToQueue) }; var twitterQueue = new Queue(frecuency, function (item) { var tweet = $(renderTweet(item)); var tweetClone = tweet.clone().hide().css({ visibility: 'hidden' }).prependTo('#tweets').slideDown(1000); tweet.css({ top: -200, position: 'absolute' }).prependTo('#tweets').animate({ top: 0 }, 1000, function () { tweetClone.css({ visibility: 'visible' }); $(this).remove(); }); $('#tweets p:below(' + window.innerHeight + ')').remove(); }).empty(run); run();

Read the article

Syntax error in INSERT INTO statement

- by user454563

I wrote a program that connects to MS Access. When I fill in the fields and add a new item to Access the program fails. The exception is "Syntax error in INSERT INTO statement" Here is the relevant code. **************************************************************** AdoHelper.cs **************************************************************** using System; using System.Collections.Generic; using System.Text; using System.Data; using System.Data.OleDb; namespace Yad2 { class AdoHelper { //get the connection string from the app.config file //Provider=Microsoft.ACE.OLEDB.12.0;Data Source=|DataDirectory|\Yad2.accdb static string connectionString = Properties.Settings.Default.DBConnection.ToString(); //declare the db connection static OleDbConnection con = new OleDbConnection(connectionString); /// <summary> /// To Execute queries which returns result set (table / relation) /// </summary> /// <param name="query">the query string</param> /// <returns></returns> public static DataTable ExecuteDataTable(string query) { try { con.Open(); OleDbCommand command = new OleDbCommand(query, con); System.Data.OleDb.OleDbDataAdapter tableAdapter = new System.Data.OleDb.OleDbDataAdapter(command); DataTable dt = new DataTable(); tableAdapter.Fill(dt); return dt; } catch (Exception ex) { throw ex; } finally { con.Close(); } } /// <summary> /// To Execute update / insert / delete queries /// </summary> /// <param name="query">the query string</param> public static void ExecuteNonQuery(string query) { try { con.Open(); System.Data.OleDb.OleDbCommand command = new System.Data.OleDb.OleDbCommand(query, con); command.ExecuteNonQuery(); } catch (Exception ex) { throw ex; } finally { con.Close(); } } /// <summary> /// To Execute queries which return scalar value /// </summary> /// <param name="query">the query string</param> public static object ExecuteScalar(string query) { try { con.Open(); System.Data.OleDb.OleDbCommand command = new System.Data.OleDb.OleDbCommand(query, con); /// here is the Excaption !!!!!!!!! return command.ExecuteScalar(); } catch { throw; } finally { con.Close(); } } } } **************************************************************************** **************************************************************************** DataQueries.cs **************************************************************************** using System; using System.Collections.Generic; using System.Text; using System.Data; namespace Yad2 { class DataQueries { public static DataTable GetAllItems() { try { string query = "Select * from Messages"; DataTable dt = AdoHelper.ExecuteDataTable(query); return dt; } catch (Exception ex) { throw ex; } } public static void AddNewItem(string mesNumber, string title , string mesDate , string contactMail , string mesType , string Details ) { string query = "Insert into Messages values(" + mesNumber + " , '" + title + "' , '" + mesDate + "' , '" + contactMail + "' , , '" + mesType + "' , '" + Details + "')"; AdoHelper.ExecuteNonQuery(query); } public static void DeleteDept(int mesNumber) { string query = "Delete from Item where MessageNumber=" + mesNumber; AdoHelper.ExecuteNonQuery(query); } } } *********************************************************************************************** plase help me .... why the program falls ?

Read the article

How can I remove this query from within a loop?

- by Chris

I am currently designing a forum as a personal project. One of the recurring issues I've come across is database queries in loops. I've managed to avoid doing that so far by using table joins or caching of data in arrays for later use. Right now though I've come across a situation where I'm not sure how I can write the code in such a way that I can use either of those methods easily. However I'd still prefer to do at most 2 queries for this operation rather than 1 + 1 per group of forums, which so far has resulted in 5 per page. So while 5 isn't a huge number (though it will increase for each forum group I add) it's the principle that's important to me here, I do NOT want to write queries in loops What I'm doing is displaying forum index groupings (eg admin forums, user forums etc) and then each forum within that group on a single page index, it's the combination of both in one page that's causing me issue. If it had just been a single group per page, I'd use a table join and problem solved. But if I use a table join here, although I can potentially get all the data I need it'll be in one mass of results and it needs displaying properly. Here's the code (I've removed some of the html for clarity) <?php $sql= "select * from forum_groups"; //query 1 $result1 = $database->query($sql); while($group = mysql_fetch_assoc($result1)) //first loop {?> <table class="threads"> <tr> <td class="forumgroupheader"> <?php echo $group['group_name']; ?> </td> </tr> <tr> <td class="forumgroupheader2"> <?php echo $group['group_desc']; ?> </td> </tr> </table> <table> <tr> <th class="thforum"> Forum Name</th> <th class="thforum"> Forum Decsription</th> <th class="thforum"> Last Post </th> <tr> <?php $group_id = $group['id']; $sql = "SELECT forums.id, forums.forum_group_id, forums.forum_name, forums.forum_desc, forums.visible_rank, forums.locked, forums.lock_rank, forums.topics, forums.posts, forums.last_post, forums.last_post_id, users.username FROM forums LEFT JOIN users on forums.last_post_id=users.id WHERE forum_group_id='{$group_id}'"; //query 2 $result2 = $database->query($sql); while($forum = mysql_fetch_assoc($result2)) //second loop {?> So how can I either a) write the SQL in such a way as to remove the second query from inside the loop or b) combine the results in an array either way I need to be able to access the data as an when so I can format it properly for the page output, ie within the loops still.

Read the article

Using LINQ Distinct: With an Example on ASP.NET MVC SelectListItem

- by Joe Mayo

One of the things that might be surprising in the LINQ Distinct standard query operator is that it doesn’t automatically work properly on custom classes. There are reasons for this, which I’ll explain shortly. The example I’ll use in this post focuses on pulling a unique list of names to load into a drop-down list. I’ll explain the sample application, show you typical first shot at Distinct, explain why it won’t work as you expect, and then demonstrate a solution to make Distinct work with any custom class. The technologies I’m using are LINQ to Twitter, LINQ to Objects, Telerik Extensions for ASP.NET MVC, ASP.NET MVC 2, and Visual Studio 2010. The function of the example program is to show a list of people that I follow. In Twitter API vernacular, these people are called “Friends”; though I’ve never met most of them in real life. This is part of the ubiquitous language of social networking, and Twitter in particular, so you’ll see my objects named accordingly. Where Distinct comes into play is because I want to have a drop-down list with the names of the friends appearing in the list. Some friends are quite verbose, which means I can’t just extract names from each tweet and populate the drop-down; otherwise, I would end up with many duplicate names. Therefore, Distinct is the appropriate operator to eliminate the extra entries from my friends who tend to be enthusiastic tweeters. The sample doesn’t do anything with the drop-down list and I leave that up to imagination for what it’s practical purpose could be; perhaps a filter for the list if I only want to see a certain person’s tweets or maybe a quick list that I plan to combine with a TextBox and Button to reply to a friend. When the program runs, you’ll need to authenticate with Twitter, because I’m using OAuth (DotNetOpenAuth), for authentication, and then you’ll see the drop-down list of names above the grid with the most recent tweets from friends. Here’s what the application looks like when it runs: As you can see, there is a drop-down list above the grid. The drop-down list is where most of the focus of this article will be. There is some description of the code before we talk about the Distinct operator, but we’ll get there soon. This is an ASP.NET MVC2 application, written with VS 2010. Here’s the View that produces this screen: <%@ Page Language="C#" MasterPageFile="~/Views/Shared/Site.Master" Inherits="System.Web.Mvc.ViewPage<TwitterFriendsViewModel>" %> <%@ Import Namespace="DistinctSelectList.Models" %> <asp:Content ID="Content1" ContentPlaceHolderID="TitleContent" runat="server"> Home Page </asp:Content><asp:Content ID="Content2" ContentPlaceHolderID="MainContent" runat="server"> <fieldset> <legend>Twitter Friends</legend> <div> <%= Html.DropDownListFor( twendVM => twendVM.FriendNames, Model.FriendNames, "<All Friends>") %> </div> <div> <% Html.Telerik().Grid<TweetViewModel>(Model.Tweets) .Name("TwitterFriendsGrid") .Columns(cols => { cols.Template(col => { %> <img src="<%= col.ImageUrl %>" alt="<%= col.ScreenName %>" /> <% }); cols.Bound(col => col.ScreenName); cols.Bound(col => col.Tweet); }) .Render(); %> </div> </fieldset> </asp:Content> As shown above, the Grid is from Telerik’s Extensions for ASP.NET MVC. The first column is a template that renders the user’s Avatar from a URL provided by the Twitter query. Both the Grid and DropDownListFor display properties that are collections from a TwitterFriendsViewModel class, shown below: using System.Collections.Generic; using System.Web.Mvc; namespace DistinctSelectList.Models { /// /// For finding friend info on screen /// public class TwitterFriendsViewModel { /// /// Display names of friends in drop-down list /// public List FriendNames { get; set; } /// /// Display tweets in grid /// public List Tweets { get; set; } } } I created the TwitterFreindsViewModel. The two Lists are what the View consumes to populate the DropDownListFor and Grid. Notice that FriendNames is a List of SelectListItem, which is an MVC class. Another custom class I created is the TweetViewModel (the type of the Tweets List), shown below: namespace DistinctSelectList.Models { /// /// Info on friend tweets /// public class TweetViewModel { /// /// User's avatar /// public string ImageUrl { get; set; } /// /// User's Twitter name /// public string ScreenName { get; set; } /// /// Text containing user's tweet /// public string Tweet { get; set; } } } The initial Twitter query returns much more information than we need for our purposes and this a special class for displaying info in the View. Now you know about the View and how it’s constructed. Let’s look at the controller next. The controller for this demo performs authentication, data retrieval, data manipulation, and view selection. I’ll skip the description of the authentication because it’s a normal part of using OAuth with LINQ to Twitter. Instead, we’ll drill down and focus on the Distinct operator. However, I’ll show you the entire controller, below, so that you can see how it all fits together: using System.Linq; using System.Web.Mvc; using DistinctSelectList.Models; using LinqToTwitter; namespace DistinctSelectList.Controllers { [HandleError] public class HomeController : Controller { private MvcOAuthAuthorization auth; private TwitterContext twitterCtx; /// /// Display a list of friends current tweets /// /// public ActionResult Index() { auth = new MvcOAuthAuthorization(InMemoryTokenManager.Instance, InMemoryTokenManager.AccessToken); string accessToken = auth.CompleteAuthorize(); if (accessToken != null) { InMemoryTokenManager.AccessToken = accessToken; } if (auth.CachedCredentialsAvailable) { auth.SignOn(); } else { return auth.BeginAuthorize(); } twitterCtx = new TwitterContext(auth); var friendTweets = (from tweet in twitterCtx.Status where tweet.Type == StatusType.Friends select new TweetViewModel { ImageUrl = tweet.User.ProfileImageUrl, ScreenName = tweet.User.Identifier.ScreenName, Tweet = tweet.Text }) .ToList(); var friendNames = (from tweet in friendTweets select new SelectListItem { Text = tweet.ScreenName, Value = tweet.ScreenName }) .Distinct() .ToList(); var twendsVM = new TwitterFriendsViewModel { Tweets = friendTweets, FriendNames = friendNames }; return View(twendsVM); } public ActionResult About() { return View(); } } } The important part of the listing above are the LINQ to Twitter queries for friendTweets and friendNames. Both of these results are used in the subsequent population of the twendsVM instance that is passed to the view. Let’s dissect these two statements for clarification and focus on what is happening with Distinct. The query for friendTweets gets a list of the 20 most recent tweets (as specified by the Twitter API for friend queries) and performs a projection into the custom TweetViewModel class, repeated below for your convenience: var friendTweets = (from tweet in twitterCtx.Status where tweet.Type == StatusType.Friends select new TweetViewModel { ImageUrl = tweet.User.ProfileImageUrl, ScreenName = tweet.User.Identifier.ScreenName, Tweet = tweet.Text }) .ToList(); The LINQ to Twitter query above simplifies what we need to work with in the View and the reduces the amount of information we have to look at in subsequent queries. Given the friendTweets above, the next query performs another projection into an MVC SelectListItem, which is required for binding to the DropDownList. This brings us to the focus of this blog post, writing a correct query that uses the Distinct operator. The query below uses LINQ to Objects, querying the friendTweets collection to get friendNames: var friendNames = (from tweet in friendTweets select new SelectListItem { Text = tweet.ScreenName, Value = tweet.ScreenName }) .Distinct() .ToList(); The above implementation of Distinct seems normal, but it is deceptively incorrect. After running the query above, by executing the application, you’ll notice that the drop-down list contains many duplicates. This will send you back to the code scratching your head, but there’s a reason why this happens. To understand the problem, we must examine how Distinct works in LINQ to Objects. Distinct has two overloads: one without parameters, as shown above, and another that takes a parameter of type IEqualityComparer<T>. In the case above, no parameters, Distinct will call EqualityComparer<T>.Default behind the scenes to make comparisons as it iterates through the list. You don’t have problems with the built-in types, such as string, int, DateTime, etc, because they all implement IEquatable<T>. However, many .NET Framework classes, such as SelectListItem, don’t implement IEquatable<T>. So, what happens is that EqualityComparer<T>.Default results in a call to Object.Equals, which performs reference equality on reference type objects. You don’t have this problem with value types because the default implementation of Object.Equals is bitwise equality. However, most of your projections that use Distinct are on classes, just like the SelectListItem used in this demo application. So, the reason why Distinct didn’t produce the results we wanted was because we used a type that doesn’t define its own equality and Distinct used the default reference equality. This resulted in all objects being included in the results because they are all separate instances in memory with unique references. As you might have guessed, the solution to the problem is to use the second overload of Distinct that accepts an IEqualityComparer<T> instance. If you were projecting into your own custom type, you could make that type implement IEqualityComparer<T>, but SelectListItem belongs to the .NET Framework Class Library. Therefore, the solution is to create a custom type to implement IEqualityComparer<T>, as in the SelectListItemComparer class, shown below: using System.Collections.Generic; using System.Web.Mvc; namespace DistinctSelectList.Models { public class SelectListItemComparer : EqualityComparer { public override bool Equals(SelectListItem x, SelectListItem y) { return x.Value.Equals(y.Value); } public override int GetHashCode(SelectListItem obj) { return obj.Value.GetHashCode(); } } } The SelectListItemComparer class above doesn’t implement IEqualityComparer<SelectListItem>, but rather derives from EqualityComparer<SelectListItem>. Microsoft recommends this approach for consistency with the behavior of generic collection classes. However, if your custom type already derives from a base class, go ahead and implement IEqualityComparer<T>, which will still work. EqualityComparer is an abstract class, that implements IEqualityComparer<T> with Equals and GetHashCode abstract methods. For the purposes of this application, the SelectListItem.Value property is sufficient to determine if two items are equal. Since SelectListItem.Value is type string, the code delegates equality to the string class. The code also delegates the GetHashCode operation to the string class.You might have other criteria in your own object and would need to define what it means for your object to be equal. Now that we have an IEqualityComparer<SelectListItem>, let’s fix the problem. The code below modifies the query where we want distinct values: var friendNames = (from tweet in friendTweets select new SelectListItem { Text = tweet.ScreenName, Value = tweet.ScreenName }) .Distinct(new SelectListItemComparer()) .ToList(); Notice how the code above passes a new instance of SelectListItemComparer as the parameter to the Distinct operator. Now, when you run the application, the drop-down list will behave as you expect, showing only a unique set of names. In addition to Distinct, other LINQ Standard Query Operators have overloads that accept IEqualityComparer<T>’s, You can use the same techniques as shown here, with SelectListItemComparer, with those other operators as well. Now you know how to resolve problems with getting Distinct to work properly and also have a way to fix problems with other operators that require equality comparisons. @JoeMayo

Read the article

Faceted search with Solr on Windows

- by Dr.NETjes

With over 10 million hits a day, funda.nl is probably the largest ASP.NET website which uses Solr on a Windows platform. While all our data (i.e. real estate properties) is stored in SQL Server, we're using Solr 1.4.1 to return the faceted search results as fast as we can.And yes, Solr is very fast. We did do some heavy stress testing on our Solr service, which allowed us to do over 1,000 req/sec on a single 64-bits Solr instance; and that's including converting search-url's to Solr http-queries and deserializing Solr's result-XML back to .NET objects! Let me tell you about faceted search and how to integrate Solr in a .NET/Windows environment. I'll bet it's easier than you think :-) What is faceted search? Faceted search is the clustering of search results into categories, allowing users to drill into search results. By showing the number of hits for each facet category, users can easily see how many results match that category. If you're still a bit confused, this example from CNET explains it all: The SQL solution for faceted search Our ("pre-Solr") solution for faceted search was done by adding a lot of redundant columns to our SQL tables and doing a COUNT(...) for each of those columns: So if a user was searching for real estate properties in the city 'Amsterdam', our facet-query would be something like: SELECT COUNT(hasGarden), COUNT(has2Bathrooms), COUNT(has3Bathrooms), COUNT(etc...) FROM Houses WHERE city = 'Amsterdam' While this solution worked fine for a couple of years, it wasn't very easy for developers to add new facets. And also, performing COUNT's on all matched rows only performs well if you have a limited amount of rows in a table (i.e. less than a million). Enter Solr "Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface." (quoted from Wikipedia's page on Solr) Solr isn't a database, it's more like a big index. Every time you upload data to Solr, it will analyze the data and create an inverted index from it (like the index-pages of a book). This way Solr can lookup data very quickly. To explain the inner workings of Solr is beyond the scope of this post, but if you want to learn more, please visit the Solr Wiki pages. Getting faceted search results from Solr is very easy; first let me show you how to send a http-query to Solr: http://localhost:8983/solr/select?q=city:Amsterdam This will return an XML document containing the search results (in this example only three houses in the city of Amsterdam): <response> <result name="response" numFound="3" start="0"> <doc> <long name="id">3203</long> <str name="city">Amsterdam</str> <str name="steet">Keizersgracht</str> <int name="numberOfBathrooms">2</int> </doc> <doc> <long name="id">3205</long> <str name="city">Amsterdam</str> <str name="steet">Vondelstraat</str> <int name="numberOfBathrooms">3</int> </doc> <doc> <long name="id">4293</long> <str name="city">Amsterdam</str> <str name="steet">Wibautstraat</str> <int name="numberOfBathrooms">2</int> </doc> </result> </response> By adding a facet-querypart for the field "numberOfBathrooms", Solr will return the facets for this particular field. We will see that there's one house in Amsterdam with three bathrooms and two houses with two bathrooms. http://localhost:8983/solr/select?q=city:Amsterdam&facet=true&facet.field=numberOfBathrooms The complete XML response from Solr now looks like: <response> <result name="response" numFound="3" start="0"> <doc> <long name="id">3203</long> <str name="city">Amsterdam</str> <str name="steet">Keizersgracht</str> <int name="numberOfBathrooms">2</int> </doc> <doc> <long name="id">3205</long> <str name="city">Amsterdam</str> <str name="steet">Vondelstraat</str> <int name="numberOfBathrooms">3</int> </doc> <doc> <long name="id">4293</long> <str name="city">Amsterdam</str> <str name="steet">Wibautstraat</str> <int name="numberOfBathrooms">2</int> </doc> </result> <lst name="facet_fields"> <lst name="numberOfBathrooms"> <int name="2">2</int> <int name="3">1</int> </lst> </lst> </response> Trying Solr for yourself To run Solr on your local machine and experiment with it, you should read the Solr tutorial. This tutorial really only takes 1 hour, in which you install Solr, upload sample data and get some query results. And yes, it works on Windows without a problem. Note that in the Solr tutorial, you're using Jetty as a Java Servlet Container (that's why you must start it using "java -jar start.jar"). In our environment we prefer to use Apache Tomcat to host Solr, which installs like a Windows service and works more like .NET developers expect. See the SolrTomcat page.Some best practices for running Solr on Windows: Use the 64-bits version of Tomcat. In our tests, this doubled the req/sec we were able to handle!Use a .NET XmlReader to convert Solr's XML output-stream to .NET objects. Don't use XPath; it won't scale well.Use filter queries ("fq" parameter) instead of the normal "q" parameter where possible. Filter queries are cached by Solr and will speed up Solr's response time (see FilterQueryGuidance)In my next post I’ll talk about how to keep Solr's indexed data in sync with the data in your SQL tables. Timestamps / rowversions will help you out here!

Read the article

SQL SERVER – Guest Posts – Feodor Georgiev – The Context of Our Database Environment – Going Beyond the Internal SQL Server Waits – Wait Type – Day 21 of 28

- by pinaldave

This guest post is submitted by Feodor. Feodor Georgiev is a SQL Server database specialist with extensive experience of thinking both within and outside the box. He has wide experience of different systems and solutions in the fields of architecture, scalability, performance, etc. Feodor has experience with SQL Server 2000 and later versions, and is certified in SQL Server 2008. In this article Feodor explains the server-client-server process, and concentrated on the mutual waits between client and SQL Server. This is essential in grasping the concept of waits in a ‘global’ application plan. Recently I was asked to write a blog post about the wait statistics in SQL Server and since I had been thinking about writing it for quite some time now, here it is. It is a wide-spread idea that the wait statistics in SQL Server will tell you everything about your performance. Well, almost. Or should I say – barely. The reason for this is that SQL Server is always a part of a bigger system – there are always other players in the game: whether it is a client application, web service, any other kind of data import/export process and so on. In short, the SQL Server surroundings look like this: This means that SQL Server, aside from its internal waits, also depends on external waits and settings. As we can see in the picture above, SQL Server needs to have an interface in order to communicate with the surrounding clients over the network. For this communication, SQL Server uses protocol interfaces. I will not go into detail about which protocols are best, but you can read this article. Also, review the information about the TDS (Tabular data stream). As we all know, our system is only as fast as its slowest component. This means that when we look at our environment as a whole, the SQL Server might be a victim of external pressure, no matter how well we have tuned our database server performance. Let’s dive into an example: let’s say that we have a web server, hosting a web application which is using data from our SQL Server, hosted on another server. The network card of the web server for some reason is malfunctioning (think of a hardware failure, driver failure, or just improper setup) and does not send/receive data faster than 10Mbs. On the other end, our SQL Server will not be able to send/receive data at a faster rate either. This means that the application users will notify the support team and will say: “My data is coming very slow.” Now, let’s move on to a bit more exciting example: imagine that there is a similar setup as the example above – one web server and one database server, and the application is not using any stored procedure calls, but instead for every user request the application is sending 80kb query over the network to the SQL Server. (I really thought this does not happen in real life until I saw it one day.) So, what happens in this case? To make things worse, let’s say that the 80kb query text is submitted from the application to the SQL Server at least 100 times per minute, and as often as 300 times per minute in peak times. Here is what happens: in order for this query to reach the SQL Server, it will have to be broken into a of number network packets (according to the packet size settings) – and will travel over the network. On the other side, our SQL Server network card will receive the packets, will pass them to our network layer, the packets will get assembled, and eventually SQL Server will start processing the query – parsing, allegorizing, generating the query execution plan and so on. So far, we have already had a serious network overhead by waiting for the packets to reach our Database Engine. There will certainly be some processing overhead – until the database engine deals with the 80kb query and its 20 subqueries. The waits you see in the DMVs are actually collected from the point the query reaches the SQL Server and the packets are assembled. Let’s say that our query is processed and it finally returns 15000 rows. These rows have a certain size as well, depending on the data types returned. This means that the data will have converted to packages (depending on the network size package settings) and will have to reach the application server. There will also be waits, however, this time you will be able to see a wait type in the DMVs called ASYNC_NETWORK_IO. What this wait type indicates is that the client is not consuming the data fast enough and the network buffers are filling up. Recently Pinal Dave posted a blog on Client Statistics. What Client Statistics does is captures the physical flow characteristics of the query between the client(Management Studio, in this case) and the server and back to the client. As you see in the image, there are three categories: Query Profile Statistics, Network Statistics and Time Statistics. Number of server roundtrips–a roundtrip consists of a request sent to the server and a reply from the server to the client. For example, if your query has three select statements, and they are separated by ‘GO’ command, then there will be three different roundtrips. TDS Packets sent from the client – TDS (tabular data stream) is the language which SQL Server speaks, and in order for applications to communicate with SQL Server, they need to pack the requests in TDS packets. TDS Packets sent from the client is the number of packets sent from the client; in case the request is large, then it may need more buffers, and eventually might even need more server roundtrips. TDS packets received from server –is the TDS packets sent by the server to the client during the query execution. Bytes sent from client – is the volume of the data set to our SQL Server, measured in bytes; i.e. how big of a query we have sent to the SQL Server. This is why it is best to use stored procedures, since the reusable code (which already exists as an object in the SQL Server) will only be called as a name of procedure + parameters, and this will minimize the network pressure. Bytes received from server – is the amount of data the SQL Server has sent to the client, measured in bytes. Depending on the number of rows and the datatypes involved, this number will vary. But still, think about the network load when you request data from SQL Server. Client processing time – is the amount of time spent in milliseconds between the first received response packet and the last received response packet by the client. Wait time on server replies – is the time in milliseconds between the last request packet which left the client and the first response packet which came back from the server to the client. Total execution time – is the sum of client processing time and wait time on server replies (the SQL Server internal processing time) Here is an illustration of the Client-server communication model which should help you understand the mutual waits in a client-server environment. Keep in mind that a query with a large ‘wait time on server replies’ means the server took a long time to produce the very first row. This is usual on queries that have operators that need the entire sub-query to evaluate before they proceed (for example, sort and top operators). However, a query with a very short ‘wait time on server replies’ means that the query was able to return the first row fast. However a long ‘client processing time’ does not necessarily imply the client spent a lot of time processing and the server was blocked waiting on the client. It can simply mean that the server continued to return rows from the result and this is how long it took until the very last row was returned. The bottom line is that developers and DBAs should work together and think carefully of the resource utilization in the client-server environment. From experience I can say that so far I have seen only cases when the application developers and the Database developers are on their own and do not ask questions about the other party’s world. I would recommend using the Client Statistics tool during new development to track the performance of the queries, and also to find a synchronous way of utilizing resources between the client – server – client. Here is another example: think about similar setup as above, but add another server to the game. Let’s say that we keep our media on a separate server, and together with the data from our SQL Server we need to display some images on the webpage requested by our user. No matter how simple or complicated the logic to get the images is, if the images are 500kb each our users will get the page slowly and they will still think that there is something wrong with our data. Anyway, I don’t mean to get carried away too far from SQL Server. Instead, what I would like to say is that DBAs should also be aware of ‘the big picture’. I wrote a blog post a while back on this topic, and if you are interested, you can read it here about the big picture. And finally, here are some guidelines for monitoring the network performance and improving it: Run a trace and outline all queries that return more than 1000 rows (in Profiler you can actually filter and sort the captured trace by number of returned rows). This is not a set number; it is more of a guideline. The general thought is that no application user can consume that many rows at once. Ask yourself and your fellow-developers: ‘why?’. Monitor your network counters in Perfmon: Network Interface:Output queue length, Redirector:Network errors/sec, TCPv4: Segments retransmitted/sec and so on. Make sure to establish a good friendship with your network administrator (buy them coffee, for example J ) and get into a conversation about the network settings. Have them explain to you how the network cards are setup – are they standalone, are they ‘teamed’, what are the settings – full duplex and so on. Find some time to read a bit about networking. In this short blog post I hope I have turned your attention to ‘the big picture’ and the fact that there are other factors affecting our SQL Server, aside from its internal workings. As a further reading I would still highly recommend the Wait Stats series on this blog, also I would recommend you have the coffee break conversation with your network admin as soon as possible. This guest post is written by Feodor Georgiev. Read all the post in the Wait Types and Queue series. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, Readers Contribution, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL

Read the article

Spooling in SQL execution plans

- by Rob Farley

Sewing has never been my thing. I barely even know the terminology, and when discussing this with American friends, I even found out that half the words that Americans use are different to the words that English and Australian people use. That said – let’s talk about spools! In particular, the Spool operators that you find in some SQL execution plans. This post is for T-SQL Tuesday, hosted this month by me! I’ve chosen to write about spools because they seem to get a bad rap (even in my song I used the line “There’s spooling from a CTE, they’ve got recursion needlessly”). I figured it was worth covering some of what spools are about, and hopefully explain why they are remarkably necessary, and generally very useful. If you have a look at the Books Online page about Plan Operators, at http://msdn.microsoft.com/en-us/library/ms191158.aspx, and do a search for the word ‘spool’, you’ll notice it says there are 46 matches. 46! Yeah, that’s what I thought too... Spooling is mentioned in several operators: Eager Spool, Lazy Spool, Index Spool (sometimes called a Nonclustered Index Spool), Row Count Spool, Spool, Table Spool, and Window Spool (oh, and Cache, which is a special kind of spool for a single row, but as it isn’t used in SQL 2012, I won’t describe it any further here). Spool, Table Spool, Index Spool, Window Spool and Row Count Spool are all physical operators, whereas Eager Spool and Lazy Spool are logical operators, describing the way that the other spools work. For example, you might see a Table Spool which is either Eager or Lazy. A Window Spool can actually act as both, as I’ll mention in a moment. In sewing, cotton is put onto a spool to make it more useful. You might buy it in bulk on a cone, but if you’re going to be using a sewing machine, then you quite probably want to have it on a spool or bobbin, which allows it to be used in a more effective way. This is the picture that I want you to think about in relation to your data. I’m sure you use spools every time you use your sewing machine. I know I do. I can’t think of a time when I’ve got out my sewing machine to do some sewing and haven’t used a spool. However, I often run SQL queries that don’t use spools. You see, the data that is consumed by my query is typically in a useful state without a spool. It’s like I can just sew with my cotton despite it not being on a spool! Many of my favourite features in T-SQL do like to use spools though. This looks like a very similar query to before, but includes an OVER clause to return a column telling me the number of rows in my data set. I’ll describe what’s going on in a few paragraphs’ time. So what does a Spool operator actually do? The spool operator consumes a set of data, and stores it in a temporary structure, in the tempdb database. This structure is typically either a Table (ie, a heap), or an Index (ie, a b-tree). If no data is actually needed from it, then it could also be a Row Count spool, which only stores the number of rows that the spool operator consumes. A Window Spool is another option if the data being consumed is tightly linked to windows of data, such as when the ROWS/RANGE clause of the OVER clause is being used. You could maybe think about the type of spool being like whether the cotton is going onto a small bobbin to fit in the base of the sewing machine, or whether it’s a larger spool for the top. A Table or Index Spool is either Eager or Lazy in nature. Eager and Lazy are Logical operators, which talk more about the behaviour, rather than the physical operation. If I’m sewing, I can either be all enthusiastic and get all my cotton onto the spool before I start, or I can do it as I need it. “Lazy” might not the be the best word to describe a person – in the SQL world it describes the idea of either fetching all the rows to build up the whole spool when the operator is called (Eager), or populating the spool only as it’s needed (Lazy). Window Spools are both physical and logical. They’re eager on a per-window basis, but lazy between windows. And when is it needed? The way I see it, spools are needed for two reasons. 1 – When data is going to be needed AGAIN. 2 – When data needs to be kept away from the original source. If you’re someone that writes long stored procedures, you are probably quite aware of the second scenario. I see plenty of stored procedures being written this way – where the query writer populates a temporary table, so that they can make updates to it without risking the original table. SQL does this too. Imagine I’m updating my contact list, and some of my changes move data to later in the book. If I’m not careful, I might update the same row a second time (or even enter an infinite loop, updating it over and over). A spool can make sure that I don’t, by using a copy of the data. This problem is known as the Halloween Effect (not because it’s spooky, but because it was discovered in late October one year). As I’m sure you can imagine, the kind of spool you’d need to protect against the Halloween Effect would be eager, because if you’re only handling one row at a time, then you’re not providing the protection... An eager spool will block the flow of data, waiting until it has fetched all the data before serving it up to the operator that called it. In the query below I’m forcing the Query Optimizer to use an index which would be upset if the Name column values got changed, and we see that before any data is fetched, a spool is created to load the data into. This doesn’t stop the index being maintained, but it does mean that the index is protected from the changes that are being done. There are plenty of times, though, when you need data repeatedly. Consider the query I put above. A simple join, but then counting the number of rows that came through. The way that this has executed (be it ideal or not), is to ask that a Table Spool be populated. That’s the Table Spool operator on the top row. That spool can produce the same set of rows repeatedly. This is the behaviour that we see in the bottom half of the plan. In the bottom half of the plan, we see that the a join is being done between the rows that are being sourced from the spool – one being aggregated and one not – producing the columns that we need for the query. Table v Index When considering whether to use a Table Spool or an Index Spool, the question that the Query Optimizer needs to answer is whether there is sufficient benefit to storing the data in a b-tree. The idea of having data in indexes is great, but of course there is a cost to maintaining them. Here we’re creating a temporary structure for data, and there is a cost associated with populating each row into its correct position according to a b-tree, as opposed to simply adding it to the end of the list of rows in a heap. Using a b-tree could even result in page-splits as the b-tree is populated, so there had better be a reason to use that kind of structure. That all depends on how the data is going to be used in other parts of the plan. If you’ve ever thought that you could use a temporary index for a particular query, well this is it – and the Query Optimizer can do that if it thinks it’s worthwhile. It’s worth noting that just because a Spool is populated using an Index Spool, it can still be fetched using a Table Spool. The details about whether or not a Spool used as a source shows as a Table Spool or an Index Spool is more about whether a Seek predicate is used, rather than on the underlying structure. Recursive CTE I’ve already shown you an example of spooling when the OVER clause is used. You might see them being used whenever you have data that is needed multiple times, and CTEs are quite common here. With the definition of a set of data described in a CTE, if the query writer is leveraging this by referring to the CTE multiple times, and there’s no simplification to be leveraged, a spool could theoretically be used to avoid reapplying the CTE’s logic. Annoyingly, this doesn’t happen. Consider this query, which really looks like it’s using the same data twice. I’m creating a set of data (which is completely deterministic, by the way), and then joining it back to itself. There seems to be no reason why it shouldn’t use a spool for the set described by the CTE, but it doesn’t. On the other hand, if we don’t pull as many columns back, we might see a very different plan. You see, CTEs, like all sub-queries, are simplified out to figure out the best way of executing the whole query. My example is somewhat contrived, and although there are plenty of cases when it’s nice to give the Query Optimizer hints about how to execute queries, it usually doesn’t do a bad job, even without spooling (and you can always use a temporary table). When recursion is used, though, spooling should be expected. Consider what we’re asking for in a recursive CTE. We’re telling the system to construct a set of data using an initial query, and then use set as a source for another query, piping this back into the same set and back around. It’s very much a spool. The analogy of cotton is long gone here, as the idea of having a continual loop of cotton feeding onto a spool and off again doesn’t quite fit, but that’s what we have here. Data is being fed onto the spool, and getting pulled out a second time when the spool is used as a source. (This query is running on AdventureWorks, which has a ManagerID column in HumanResources.Employee, not AdventureWorks2012) The Index Spool operator is sucking rows into it – lazily. It has to be lazy, because at the start, there’s only one row to be had. However, as rows get populated onto the spool, the Table Spool operator on the right can return rows when asked, ending up with more rows (potentially) getting back onto the spool, ready for the next round. (The Assert operator is merely checking to see if we’ve reached the MAXRECURSION point – it vanishes if you use OPTION (MAXRECURSION 0), which you can try yourself if you like). Spools are useful. Don’t lose sight of that. Every time you use temporary tables or table variables in a stored procedure, you’re essentially doing the same – don’t get upset at the Query Optimizer for doing so, even if you think the spool looks like an expensive part of the query. I hope you’re enjoying this T-SQL Tuesday. Why not head over to my post that is hosting it this month to read about some other plan operators? At some point I’ll write a summary post – once I have you should find a comment below pointing at it. @rob_farley

Read the article

Performance Enhancement in Full-Text Search Query

- by Calvin Sun

Ever since its first release, we are continuing consolidating and developing InnoDB Full-Text Search feature. There is one recent improvement that worth blogging about. It is an effort with MySQL Optimizer team that simplifies some common queries’ Query Plans and dramatically shorted the query time. I will describe the issue, our solution and the end result by some performance numbers to demonstrate our efforts in continuing enhancement the Full-Text Search capability. The Issue: As we had discussed in previous Blogs, InnoDB implements Full-Text index as reversed auxiliary tables. The query once parsed will be reinterpreted into several queries into related auxiliary tables and then results are merged and consolidated to come up with the final result. So at the end of the query, we’ll have all matching records on hand, sorted by their ranking or by their Doc IDs. Unfortunately, MySQL’s optimizer and query processing had been initially designed for MyISAM Full-Text index, and sometimes did not fully utilize the complete result package from InnoDB. Here are a couple examples: Case 1: Query result ordered by Rank with only top N results: mysql> SELECT FTS_DOC_ID, MATCH (title, body) AGAINST ('database') AS SCORE FROM articles ORDER BY score DESC LIMIT 1; In this query, user tries to retrieve a single record with highest ranking. It should have a quick answer once we have all the matching documents on hand, especially if there are ranked. However, before this change, MySQL would almost retrieve rankings for almost every row in the table, sort them and them come with the top rank result. This whole retrieve and sort is quite unnecessary given the InnoDB already have the answer. In a real life case, user could have millions of rows, so in the old scheme, it would retrieve millions of rows' ranking and sort them, even if our FTS already found there are two 3 matched rows. Apparently, the million ranking retrieve is done in vain. In above case, it should just ask for 3 matched rows' ranking, all other rows' ranking are 0. If it want the top ranking, then it can just get the first record from our already sorted result. Case 2: Select Count(*) on matching records: mysql> SELECT COUNT(*) FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE); In this case, InnoDB search can find matching rows quickly and will have all matching rows. However, before our change, in the old scheme, every row in the table was requested by MySQL one by one, just to check whether its ranking is larger than 0, and later comes up a count. In fact, there is no need for MySQL to fetch all rows, instead InnoDB already had all the matching records. The only thing need is to call an InnoDB API to retrieve the count The difference can be huge. Following query output shows how big the difference can be: mysql> select count(*) from searchindex_inno where match(si_title, si_text) against ('people') +----------+ | count(*) | +----------+ | 666877 | +----------+ 1 row in set (16 min 17.37 sec) So the query took almost 16 minutes. Let’s see how long the InnoDB can come up the result. In InnoDB, you can obtain extra diagnostic printout by turning on “innodb_ft_enable_diag_print”, this will print out extra query info: Error log: keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 2 secs: row(s) 666877: error: 10 ft_init() ft_init_ext() keynr=2, 'people' NL search Total docs: 10954826 Total words: 0 UNION: Searching: 'people' Processing time: 3 secs: row(s) 666877: error: 10 Output shows it only took InnoDB only 3 seconds to get the result, while the whole query took 16 minutes to finish. So large amount of time has been wasted on the un-needed row fetching. The Solution: The solution is obvious. MySQL can skip some of its steps, optimize its plan and obtain useful information directly from InnoDB. Some of savings from doing this include: 1) Avoid redundant sorting. Since InnoDB already sorted the result according to ranking. MySQL Query Processing layer does not need to sort to get top matching results. 2) Avoid row by row fetching to get the matching count. InnoDB provides all the matching records. All those not in the result list should all have ranking of 0, and no need to be retrieved. And InnoDB has a count of total matching records on hand. No need to recount. 3) Covered index scan. InnoDB results always contains the matching records' Document ID and their ranking. So if only the Document ID and ranking is needed, there is no need to go to user table to fetch the record itself. 4) Narrow the search result early, reduce the user table access. If the user wants to get top N matching records, we do not need to fetch all matching records from user table. We should be able to first select TOP N matching DOC IDs, and then only fetch corresponding records with these Doc IDs. Performance Results and comparison with MyISAM The result by this change is very obvious. I includes six testing result performed by Alexander Rubin just to demonstrate how fast the InnoDB query now becomes when comparing MyISAM Full-Text Search. These tests are base on the English Wikipedia data of 5.4 Million rows and approximately 16G table. The test was performed on a machine with 1 CPU Dual Core, SSD drive, 8G of RAM and InnoDB_buffer_pool is set to 8 GB. Table 1: SELECT with LIMIT CLAUSE mysql> SELECT si_title, match(si_title, si_text) against('family') as rel FROM si WHERE match(si_title, si_text) against('family') ORDER BY rel desc LIMIT 10; InnoDB MyISAM Times Faster Time for the query 1.63 sec 3 min 26.31 sec 127 You can see for this particular query (retrieve top 10 records), InnoDB Full-Text Search is now approximately 127 times faster than MyISAM. Table 2: SELECT COUNT QUERY mysql>select count(*) from si where match(si_title, si_text) against('family‘); +----------+ | count(*) | +----------+ | 293955 | +----------+ InnoDB MyISAM Times Faster Time for the query 1.35 sec 28 min 59.59 sec 1289 In this particular case, where there are 293k matching results, InnoDB took only 1.35 second to get all of them, while take MyISAM almost half an hour, that is about 1289 times faster!. Table 3: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county California 0.93 sec 32.03 sec 34.4 President united states of America 2.5 sec 36.98 sec 14.8 Table 4: SELECT title and text with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, si_title, si_text, ... as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.61 sec 41.65 sec 68.3 family film 1.15 sec 47.17 sec 41.0 Pizza restaurant orange county california 1.03 sec 48.2 sec 46.8 President united states of america 2.49 sec 44.61 sec 17.9 Table 5: SELECT ID with ORDER BY and LIMIT CLAUSE for selected terms mysql> SELECT <ID>, match(si_title, si_text) against(<TERM>) as rel FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) ORDER BY rel desc LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.5 sec 5.05 sec 10.1 family film 0.95 sec 25.39 sec 26.7 Pizza restaurant orange county califormia 0.93 sec 32.03 sec 34.4 President united states of america 2.5 sec 36.98 sec 14.8 Table 6: SELECT COUNT(*) mysql> SELECT count(*) FROM si_<TB> WHERE match(si_title, si_text) against (<TERM>) LIMIT 10; Term InnoDB (time to execute) MyISAM(time to execute) Times Faster family 0.47 sec 82 sec 174.5 family film 0.83 sec 131 sec 157.8 Pizza restaurant orange county califormia 0.74 sec 106 sec 143.2 President united states of america 1.96 sec 220 sec 112.2 Again, table 3 to table 6 all showing InnoDB consistently outperform MyISAM in these queries by a large margin. It becomes obvious the InnoDB has great advantage over MyISAM in handling large data search. Summary: These results demonstrate the great performance we could achieve by making MySQL optimizer and InnoDB Full-Text Search more tightly coupled. I think there are still many cases that InnoDB’s result info have not been fully taken advantage of, which means we still have great room to improve. And we will continuously explore the area, and get more dramatic results for InnoDB full-text searches. Jimmy Yang, September 29, 2012

Read the article

CodePlex Daily Summary for Wednesday, July 04, 2012

CodePlex Daily Summary for Wednesday, July 04, 2012Popular ReleasesMVC Controls Toolkit: Mvc Controls Toolkit 2.2.0: Added Modified all Mv4 related features to conform with the Mvc4 RC Now all items controls accept any IEnumerable<T>(before just List<T> were accepted by most of controls) retrievalManager class that retrieves automatically data from a data source whenever it catchs events triggered by filtering, sorting, and paging controls move method to the updatesManager to move one child objects from a father to another. The move operation can be undone like the insert, update and delete operatio...BlackJumboDog: Ver5.6.6: 2012.07.03 Ver5.6.6 (1) ???????????ftp://?????????、????LIST?????Mini SQL Query: Mini SQL Query (v1.0.68.441): Just a bug fix release for when the connections try to refresh after an edit. Make sure you read the Quickstart for an introduction.Microsoft Ajax Minifier: Microsoft Ajax Minifier 4.58: Fix for Issue #18296: provide "ALL" value to the -ignore switch to ignore all error and warning messages. Fix for issue #18293: if encountering EOF before a function declaration or expression is properly closed, throw an appropriate error and don't crash. Adjust the variable-renaming algorithm so it's very specific when renaming variables with the same number of references so a single source file ends up with the same minified names on different platforms. add the ability to specify kno...LogExpert: 1.4 build 4566: This release for the 1.4 version line contains various fixes which have been made some times ago. Until now these fixes were only available in the 1.5 alpha versions. It also contains a fix for: 710. Column finder (press F8 to show) Terminal server issues: Multiple sessions with same user should work now Settings Export/Import available via Settings Dialog still incomple (e.g. tab colors are not saved) maybe I change the file format one day no command line support yet (for importin...DynamicToSql: DynamicToSql 1.0.0 (beta): 1.0.0 beta versionCommonLibrary.NET: CommonLibrary.NET 0.9.8.5 - Final Release: A collection of very reusable code and components in C# 4.0 ranging from ActiveRecord, Csv, Command Line Parsing, Configuration, Holiday Calendars, Logging, Authentication, and much more. FluentscriptCommonLibrary.NET 0.9.8 contains a scripting language called FluentScript. Releases notes for FluentScript located at http://fluentscript.codeplex.com/wikipage?action=Edit&title=Release%20Notes&referringTitle=Documentation Fluentscript - 0.9.8.5 - Final ReleaseApplication: FluentScript Versio...SharePoint 2010 Metro UI: SharePoint 2010 Metro UI8: Please review the documentation link for how to install. Installation takes some basic knowledge of how to upload and edit SharePoint Artifact files. Please view the discussions tab for ongoing FAQsnopCommerce. Open source shopping cart (ASP.NET MVC): nopcommerce 2.60: Highlight features & improvements: • Significant performance optimization. • Use AJAX for adding products to the cart. • New flyout mini-shopping cart. • Auto complete suggestions for product searching. • Full-Text support. • EU cookie law support. To see the full list of fixes and changes please visit the release notes page (http://www.nopCommerce.com/releasenotes.aspx).THE NVL Maker: The NVL Maker Ver 3.51: http://download.codeplex.com/Download?ProjectName=nvlmaker&DownloadId=371510 ????：http://115.com/file/beoef05k#THE-NVL-Maker-ver3.51-sim.7z ????：http://www.mediafire.com/file/6tqdwj9jr6eb9qj/THENVLMakerver3.51tra.7z ======================================== ???? ======================================== 3.51 beta ???： ·?????????????????????? ·?????????，?????????0，?????????????????????? ·??????????????????????????? ·?????????????TJS????（EXP??） ·??4:3???，???????????????，??????????? ·?????????...????: ????2.0.3: 1、???????????。 2、????????。 3、????????????。 4、bug??，????。AssaultCube Reloaded: 2.5 Intrepid: Linux has Ubuntu 11.10 32-bit precompiled binaries and Ubuntu 10.10 64-bit precompiled binaries, but you can compile your own as it also contains the source. If you are using Mac or other operating systems, download the Linux package. Try to compile it. If it fails, download a virtual machine. The server pack is ready for both Windows and Linux, but you might need to compile your own for Linux (source included) You should delete /home/config/saved.cfg to reset binds/other stuff If you us...Magelia WebStore Open-source Ecommerce software: Magelia WebStore 2.0: User Right Licensing ContentType version 2.0.267.1Bongiozzo Photosite: Alpha: Just first stable releaseMDS MODELING WORKBOOK: MDS MODELING WORKBOOK: This is the initial release. Works with SQL 2008 R2 Master Data Services. Also works with SQL 2012 Master Data Services but has not been completely tested.Logon Screen Launcher: Logon Screen Launcher 1.3.0: FIXED - Minor handle leak issueBF3Rcon.NET: BF3Rcon.NET 25.0: This update brings the library up to server release R25, which includes the few additions from R21. There are also some minor bug fixes and a couple of other minor changes. In addition, many methods now take advantage of the RconResult class, which will give error information on failed requests; this replaces the bool returned by many methods. There is also an implicit conversion from RconResult to bool (both of which were true on success), so old code shouldn't break. ChangesAdded Player.S...TelerikMvcGridCustomBindingHelper: Version 1.0.15.183-RC: TelerikMvcGridCustomBindingHelper 1.0.15.183 RC This is a RC (release candidate) version, please test and report any error or problem you encounter. Warning: There are many changes in this release and some of them break backward compatibility. Release notes (since 0.5.0-Alpha version): Custom aggregates via an inherited class or inline fluent function Ignore group on aggregates for better performance Projections (restriction of the database columns queried) for an even better performa...PunkBuster™ Screenshot Viewer: PunkBuster™ Screenshot Viewer 1.0: First release of PunkBuster™ Screenshot ViewerDesigning Windows 8 Applications with C# and XAML: Chapters 1 - 7 Release Preview: Source code for all examples from Chapters 1 - 7 for the Release PreviewNew ProjectsAzureMVC4: hiBoonCraft Launcher: BoonCraft Launcher V2.0 See http://352n.dyndns.org for more info on BoonCraftC# to Javascript: Have you ever wanted to automagically have access to the enums you use in your .NET code in the javascript code you're writing for client-side?CMCIC payment gateway provider for NB_Store: CMCIC payment gateway provider for NB_StoreCOFE2 : Cloud Over IFileSystemInfo Entries Extensions: COFE2 enable user to access the user-defined file system on local or foreign computer, using a System.IO-like interface or a RESTful Web API.Directory access via LDAP: .NET library for managing a directory via LDAP.E-mail processing: .NET library for processing e-mail.FAST Search for SharePoint Query Statistics: F4SP Query Statistics scans the FAST for SharePoint Query Logs and presents statistics based on the logs. Total Queries, Top Queries, Queries per user etc...File Backup: This project is an open source windows azure cloud backup win forms application.HanxiaoFu's personal: This will help synchronizing my work done in home and at workLifekostyuk: This is my first project on TFSNet WebSocket Server: NetWebSocket Server is c# based hight performance and scalable Websocket server. Posroid for Windows 8: ?? ??????? ????? ?? ?? ????? ??? ?? ??? ??? ???? ? ? ???, ???8??? ???? ?? ??? ? ? ??? ??? ????? ?????.PowerRules: PowerRules is a group of scripts that help you audit your farm for Configuration Drift (Configuration changes over time)Projet Niloc TETRAS: Student Project to know how to manage and coordinating a team.proyectobanco: PROFE AQUI ESTA EL PROYECTO DISCULPE NOMAS ATT SANCANsheetengine - Isometric HTML5 JavaScript Display Engine: Sheetengine is an HTML5 canvas based isometric display engine for JavaScript. It features textures, z-ordering, shadows, intersecting sheets, object movements.Shiny2: GTS Spoofing program for Generation IV and V of Pokemon.SMS Backup & Restore XML to MySQL: The purpose is to take the XML files created by SMS Backup and Restore (Android) and importing them via a Dropbox/Google Drive synch into a MySQL dbStundenplan TSST: App für Windows Phone um die einzelnen Vertretungspläne der technischen Schule Steinfurt anzusehenswalmacenamiento: Proyecto para el almacenamiento de registrosTFS Work Item Association Check-in Policy: This policy requires TFS source control check-ins to be associated with a single, in-progress task that is assigned to you.TurboTemplate: TurboTemplate is a fast source code generation helper which quick transforms between your SQL database and some templated text of your choice.visblog: this is short summary of my projectVisualHG_fliedonion: This is fork of VisualHG. This will used by improve VisualHG for me. support only Visual Studio 2008 (not SP1). Wave Tag Library: A very simple and modest .wav file tag library. With this library you can load .wav files, edit the tags (equivalent to mp3's ID3 tags) and save back to file.Wordpress: WordPress is web software you can use to create a beautiful website or blog. We like to say that WordPress is both free and priceless at the same time.ZEAL-C02 Bluetooth module Driver for Netduino: A class library for the .NET Micro Framework to support the Zeal-C02 Bluetooth module for Netduino.

Read the article

Introduction to LinqPad Driver for StreamInsight 2.1

- by Roman Schindlauer

We are announcing the availability of the LinqPad driver for StreamInsight 2.1. The purpose of this blog post is to offer a quick introduction into the new features that we added to the StreamInsight LinqPad driver. We’ll show you how to connect to a remote server, how to inspect the entities present of that server, how to compose on top of them and how to manage their lifetime. Installing the driver Info on how to install the driver can be found in an earlier blog post here. Establishing connections As you click on the “Add Connection” link in the left pane you will notice that now it’s possible to build the data context automatically. The new driver appears as an option in the upper list, and if you pick it you will open a connection dialog that lets you connect to a remote StreamInsight server. The connection dialog lets you specify the address of the remote server. You will notice that it’s possible to pick up the binding information from the configuration file of the LinqPad application (which is normally in the same folder as LinqPad.exe and is called LinqPad.exe.config). In order for the context to be generated you need to pick an application from the server. The control is editable hence you can create a new application if you don’t want to make changes to an existing application. If you choose a new application name you will be prompted for confirmation before this gets created. Once you click OK the connection is created and you can start issuing queries against the remote server. If there’s any connectivity error the connection is marked with a red X and you can see the error message informing you what went wrong (i.e., the remote server could not be reached etc.). The context for remote servers Let’s take a look at what happens after we are connected successfully. Every LinqPad query runs inside a context – think of it as a class that wraps all the code that you’re writing. If you’re connecting to a live server the context will contain the following: The application object itself. All entities present in this application (sources, sinks, subjects and processes). The picture below shows a snapshot of the left pane of LinqPad after a successful connection. Every entity on the server has a different icon which will allow users to figure out its purpose. You will also notice that some entities have a string in parentheses following the name. It should be interpreted as such: the first name is the name of the property of the context class and the second name is the name of the entity as it exists on the server. Not all valid entity names are valid identifier names so in cases where we had to make a transformation you see both. Note also that as you hover over the entities you get IntelliSense with their types – more on that later. Remoting is not supported As you play with the entities exposed by the context you will notice that you can’t read and write directly to/from them. If for instance you’re trying to dump the content of an entity you will get an error message telling you that in the current version remoting is not supported. This is because the entity lives on the remote server and dumping its content means reading the events produced by this entity into the local process. ObservableSource.Dump(); Will yield the following error: Reading from a remote 'System.Reactive.Linq.IQbservable`1[System.Int32]' is not supported. Use the 'Microsoft.ComplexEventProcessing.Linq.RemoteProvider.Bind' method to read from the source using a remote observer. This basically tells you that you can call the Bind() method to direct the output of this source to a sink that has to be defined on the remote machine as well. You can’t bring the results to the LinqPad window unless you write code specifically for that. Compose queries You may ask – what's the purpose of all that? After all the same information is present in the EventFlowDebugger, why bother with showing it in LinqPad? First of all, What gets exposed in LinqPad is not what you see in the debugger. In LinqPad we have a property on the context class for every entity that lives on the server. Because LinqPad offers IntelliSense we in fact have much more information about the entity, and more importantly we can compose with that entity very easily. For example, let’s say that this code creates an entity: using (var server = Server.Connect(...)) { var a = server.CreateApplication("WhiteFish"); var src = a .DefineObservable<int>(() => Observable.Range(0, 3)) .Deploy("ObservableSource"); If later we want to compose with the source we have to fetch it and then we can bind something to a.GetObservable<int>("ObservableSource)").Bind(... This means that we had to know a bunch of things about this: that it’s a source, that it’s an observable, it produces a result with payload Int32 and it’s named “ObservableSource”. Only the second and last bits of information are present in the debugger, by the way. As you type in the query window you see that all the entities are present, you get IntelliSense support for them and it’s much easier to make sense of what’s available. Let’s look at a scenario where composition is plausible. With the new programming model it’s possible to create “cold” sources that are parameterized. There was a way to accomplish that even in the previous version by passing parameters to the adapters, but this time it’s much more elegant because the expression declares what parameters are required. Say that we hover the mouse over the ThrottledSource source – we will see that its type is Func<int, int, IQbservable<int>> - this in effect means that we need to pass two int parameters before we can get a source that produces events, and the type for those events is int – in the particular case of my example I had the source produce a range of integers and the two parameters were the start and end of the range. So we see how a developer can create a source that is not running yet. Then someone else (e.g. an administrator) can pass whatever parameters appropriate and run the process. Proxy Types Here’s an interesting scenario – what if someone created a source on a server but they forgot to tell you what type they used. Worse yet, they might have used an anonymous type and even though they can refer to it by name you can’t figure out how to use that type. Let’s walk through an example that shows how you can compose against types you don’t need to have the definition of. This is how we can create a source that returns an anonymous type: Application.DefineObservable(() => Observable.Range(1, 10).Select(i => new { I = i })).Deploy("O1"); Now if we refresh the connection we can see the new source named O1 appear in the list. But what’s more important is that we now have a type to work with. So we can compose a query that refers to the anonymous type. var threshold = new StreamInsightDynamicDriver.TypeProxies.AnonymousType1_0<int>(5); var filter = from i in O1 where i > threshold select i; filter.Deploy("O2"); You will notice that the anonymous type defined with this statement: new { I = i } can now be manipulated by a client that does not have access to it because the LinqPad driver has generated another type in its stead, named StreamInsightDynamicDriver.TypeProxies.AnonymousType1_0. This type has all the properties and fields of the type defined on the server, except in this case we can instantiate values and use it to compose more queries. It is worth noting that the same thing works for types that are not anonymous – the test is if the LinqPad driver can resolve the type or not. If it’s not possible then a new type will be generated that approximates the type that exists on the server. Control metadata In addition to composing processes on top of the existing entities we can do other useful things. We can delete them – nothing new here as we simply access the entities through the Entities collection of the application class. Here is where having their real name in parentheses comes handy. There’s another way to find out what’s behind a property – dump its expression. The first line in the output tells us what’s the name of the entity used to build this property in the context. Runtime information So let’s create a process to see what happens. We can bind a source to a sink and run the resulting process. If you right click on the connection you can refresh it and see the process present in the list of entities. Then you can drag the process to the query window and see that you can have access to process object in the Processes collection of the application. You can then manipulate the process (delete it, read its diagnostic view etc.). Regards, The StreamInsight Team

Read the article

Boost your infrastructure with Coherence into the Cloud

- by Nino Guarnacci

Authors: Nino Guarnacci & Francesco Scarano, at this URL could be found the original article: http://blogs.oracle.com/slc/coherence_into_the_cloud_boost. Thinking about the enterprise cloud, come to mind many possible configurations and new opportunities in enterprise environments. Various customers needs that serve as guides to this new trend are often very different, but almost always united by two main objectives: Elasticity of infrastructure both Hardware and Software Investments related to the progressive needs of the current infrastructure Characteristics of innovation and economy. A concrete use case that I worked on recently demanded the fulfillment of two basic requirements of economy and innovation.The client had the need to manage a variety of data cache, which can process complex queries and parallel computational operations, maintaining the caches in a consistent state on different server instances, on which the application was installed.In addition, the customer was looking for a solution that would allow him to manage the likely situations in load peak during certain times of the year.For this reason, the customer requires a replication site, on which convey part of the requests during periods of peak; the desire was, however, to prevent the immobilization of investments in owned hardware-software architectures; so, to respond to this need, it was requested to seek a solution based on Cloud technologies and architectures already offered by the market. Coherence can already now address the requirements of large cache between different nodes in the cluster, providing further technology to search and parallel computing, with the simultaneous use of all hardware infrastructure resources. Moreover, thanks to the functionality of "Push Replication", which can replicate and update the information contained in the cache, even to a site hosted in the cloud, it is satisfied the need to make resilient infrastructure that can be based also on nodes temporarily housed in the Cloud architectures. There are different types of configurations that can be realized using the functionality "Push-Replication" of Coherence. Configurations can be either: Active - Passive Hub and Spoke Active - Active Multi Master Centralized Replication Whereas the architecture of this particular project consists of two sites (Site 1 and Site Cloud), between which only Site 1 is enabled to write into the cache, it was decided to adopt an Active-Passive Configuration type (Hub and Spoke). If, however, the requirement should change over time, it will be particularly easy to change this configuration in an Active-Active configuration type. Although very simple, the small sample in this post, inspired by the specific project is effective, to better understand the features and capabilities of Coherence and its configurations. Let's create two distinct coherence cluster, located at miles apart, on two different domain contexts, one of them "hosted" at home (on-premise) and the other one hosted by any cloud provider on the network (or just the same laptop to test it :)). These two clusters, which we call Site 1 and Site Cloud, will contain the necessary information, so a simple client can insert data only into the Site 1. On both sites will be subscribed a listener, who listens to the variations of specific objects within the various caches. To implement these features, you need 4 simple classes: CachedResponse.java Represents the POJO class that will be inserted into the cache, and fulfills the task of containing useful information about the hypothetical links navigation ResponseSimulatorHelper.java Represents a link simulator, which has the task of randomly creating objects of type CachedResponse that will be added into the caches CacheCommands.java Represents the model of our example, because it is responsible for receiving instructions from the controller and performing basic operations against the cache, such as insert, delete, update, listening, objects within the cache Shell.java It is our controller, which give commands to be executed within the cache of the two Sites So, summarily, we execute the java class "Shell", asking it to put into the cache 100 objects of type "CachedResponse" through the java class "CacheCommands", then the simulator "ResponseSimulatorHelper" will randomly create new instances of objects "CachedResponse ". Finally, the Shell class will listen to for events occurring within the cache on the Site Cloud, while insertions and deletions are performed on Site 1. Now, we realize the two configurations of two respective sites / cluster: Site 1 and Site Cloud.For the Site 1 we define a cache of type "distributed" with features of "read and write", using the cache class store for the "push replication", a functionality offered by the project "incubator" of Oracle Coherence.For the "Site Cloud" we expect even the definition of “distributed” cache type with tcp proxy feature enabled, so it can receive updates from Site 1. Coherence Cache Config XML file for "storage node" on "Site 1" site1-prod-cache-config.xml Coherence Cache Config XML file for "storage node" on "Site Cloud" site2-prod-cache-config.xml For two clients "Shell" which will connect respectively to the two clusters we have provided two easy access configurations. Coherence Cache Config XML file for Shell on "Site 1" site1-shell-prod-cache-config.xml Coherence Cache Config XML file for Shell on "Site Cloud" site2-shell-prod-cache-config.xml Now, we just have to get everything and run our tests. To start at least one "storage" node (which holds the data) for the "Cloud Site", we can run the standard class provided OOTB by Oracle Coherence com.tangosol.net.DefaultCacheServer with the following parameters and values:-Xmx128m-Xms64m-Dcom.sun.management.jmxremote -Dtangosol.coherence.management=all -Dtangosol.coherence.management.remote=true -Dtangosol.coherence.distributed.localstorage=true -Dtangosol.coherence.cacheconfig=config/site2-prod-cache-config.xml-Dtangosol.coherence.clusterport=9002-Dtangosol.coherence.site=SiteCloud To start at least one "storage" node (which holds the data) for the "Site 1", we can perform again the standard class provided by Coherence com.tangosol.net.DefaultCacheServer with the following parameters and values:-Xmx128m-Xms64m-Dcom.sun.management.jmxremote -Dtangosol.coherence.management=all -Dtangosol.coherence.management.remote=true -Dtangosol.coherence.distributed.localstorage=true -Dtangosol.coherence.cacheconfig=config/site1-prod-cache-config.xml-Dtangosol.coherence.clusterport=9001-Dtangosol.coherence.site=Site1 Then, we start the first client "Shell" for the "Cloud Site", launching the java class it.javac.Shell using these parameters and values: -Xmx64m-Xms64m-Dcom.sun.management.jmxremote -Dtangosol.coherence.management=all -Dtangosol.coherence.management.remote=true -Dtangosol.coherence.distributed.localstorage=false -Dtangosol.coherence.cacheconfig=config/site2-shell-prod-cache-config.xml-Dtangosol.coherence.clusterport=9002-Dtangosol.coherence.site=SiteCloud Finally, we start the second client "Shell" for the "Site 1", re-launching a new instance of class it.javac.Shell using the following parameters and values: -Xmx64m-Xms64m-Dcom.sun.management.jmxremote -Dtangosol.coherence.management=all -Dtangosol.coherence.management.remote=true -Dtangosol.coherence.distributed.localstorage=false -Dtangosol.coherence.cacheconfig=config/site1-shell-prod-cache-config.xml-Dtangosol.coherence.clusterport=9001-Dtangosol.coherence.site=Site1 And now, let’s execute some tests to validate and better understand our configuration. TEST 1The purpose of this test is to load the objects into the "Site 1" cache and seeing how many objects are cached on the "Site Cloud". Within the "Shell" launched with parameters to access the "Site 1", let’s write and run the command: load test/100 Within the "Shell" launched with parameters to access the "Site Cloud" let’s write and run the command: size passive-cache Expected result If all is OK, the first "Shell" has uploaded 100 objects into a cache named "test"; consequently the "push-replication" functionality has updated the "Site Cloud" by sending the 100 objects to the second cluster where they will have been posted into a respective cache, which we named "passive-cache". TEST 2The purpose of this test is to listen to deleting and adding events happening on the "Site 1" and that are replicated within the cache on "Cloud Site". In the "Shell" launched with parameters to access the "Site Cloud" let’s write and run the command: listen passive-cache/name like '%' or a "cohql" query, with your preferred parameters In the "Shell" launched with parameters to access the "Site 1" let’s write and run the following commands: load test/10 load test2/20 delete test/50 Expected result If all is OK, the "Shell" to Site Cloud let us to listen to all the add and delete events within the cache "cache-passive", whose objects satisfy the query condition "name like '%' " (ie, every objects in the cache; you could change the tests and create different queries).Through the Shell to "Site 1" we launched the commands to add and to delete objects on different caches (test and test2). With the "Shell" running on "Site Cloud" we got the evidence (displayed or printed, or in a log file) that its cache has been filled with events and related objects generated by commands executed from the" Shell "on" Site 1 ", thanks to "push-replication" feature. Other tests can be performed, such as, for example, the subscription to the events on the "Site 1" too, using different "cohql" queries, changing the cache configuration, to effectively demonstrate both the potentiality and the versatility produced by these different configurations, even in the cloud, as in our case. More information on how to configure Coherence "Push Replication" can be found in the Oracle Coherence Incubator project documentation at the following link: http://coherence.oracle.com/display/INC10/Home More information on Oracle Coherence "In Memory Data Grid" can be found at the following link: http://www.oracle.com/technetwork/middleware/coherence/overview/index.html To download and execute the whole sources and configurations of the example explained in the above post, click here to download them; After download the last available version of the Push-Replication Pattern library implementation from the Oracle Coherence Incubator site, and download also the related and required version of Oracle Coherence. For simplicity the required .jarS to execute the example (that can be found into the Push-Replication-Pattern download and Coherence Distribution download) are: activemq-core-5.3.1.jar activemq-protobuf-1.0.jar aopalliance-1.0.jar coherence-commandpattern-2.8.4.32329.jar coherence-common-2.2.0.32329.jar coherence-eventdistributionpattern-1.2.0.32329.jar coherence-functorpattern-1.5.4.32329.jar coherence-messagingpattern-2.8.4.32329.jar coherence-processingpattern-1.4.4.32329.jar coherence-pushreplicationpattern-4.0.4.32329.jar coherence-rest.jar coherence.jar commons-logging-1.1.jar commons-logging-api-1.1.jar commons-net-2.0.jar geronimo-j2ee-management_1.0_spec-1.0.jar geronimo-jms_1.1_spec-1.1.1.jar http.jar jackson-all-1.8.1.jar je.jar jersey-core-1.8.jar jersey-json-1.8.jar jersey-server-1.8.jar jl1.0.jar kahadb-5.3.1.jar miglayout-3.6.3.jar org.osgi.core-4.1.0.jar spring-beans-2.5.6.jar spring-context-2.5.6.jar spring-core-2.5.6.jar spring-osgi-core-1.2.1.jar spring-osgi-io-1.2.1.jar At this URL could be found the original article: http://blogs.oracle.com/slc/coherence_into_the_cloud_boost Authors: Nino Guarnacci & Francesco Scarano

Search Results

Search found 4788 results on 192 pages for 'adhoc queries'.

Page 64/192 | < Previous Page | 60 61 62 63 64 65 66 67 68 69 70 71 | Next Page >

- by user45745

- by Jim

- by miccet

- by Sarah

- by SQLOS Team

- by Jake

- by Ken Smith

- by user259284

- by Mohamed Elgharabawy

- by Dor

- by clsmith

- by Rising Star

- by jasonmp85

- by Pablo

- by user454563

- by Chris

- by Joe Mayo

- by Dr.NETjes

- by pinaldave

- by Rob Farley

- by Calvin Sun

- by Roman Schindlauer

- by Nino Guarnacci

< Previous Page | 60 61 62 63 64 65 66 67 68 69 70 71 | Next Page >