Search Results

Search found 34274 results on 1371 pages for 'mysql table'.

Page 37/1371 | < Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44  | Next Page >

  • MySQL - Exclude rows from Select based on duplication of two columns

    - by Carson C.
    I am attempting to narrow results of an existing complex query based on conditional matches on multiple columns within the returned data set. I'll attempt to simplify the data as much as possible here. Assume that the following table structure represents the data that my existing complex query has already selected (here ordered by date): +----+-----------+------+------------+ | id | remote_id | type | date | +----+-----------+------+------------+ | 1 | 1 | A | 2011-01-01 | | 3 | 1 | A | 2011-01-07 | | 5 | 1 | B | 2011-01-07 | | 4 | 1 | A | 2011-05-01 | +----+-----------+------+------------+ I need to select from that data set based on the following criteria: If the pairing of remote_id and type is unique to the set, return the row always If the pairing of remote_id and type is not unique to the set, take the following action: Of the sets of rows for which the pairing of remote_id and type are not unique, return only the single row for which date is greatest and still less than or equal to now. So, if today is 2010-01-10, I'd like the data set returned to be: +----+-----------+------+------------+ | id | remote_id | type | date | +----+-----------+------+------------+ | 3 | 1 | A | 2011-01-07 | | 5 | 1 | B | 2011-01-07 | +----+-----------+------+------------+ For some reason I'm having no luck wrapping my head around this one. I suspect the answer lies in good application of group_by, but I just can't grasp it. Any help is greatly appreciated!

    Read the article

  • mySQL Optimization Suggestions

    - by Brian Schroeter
    I'm trying to optimize our mySQL configuration for our large Magento website. The reason I believe that mySQL needs to be configured further is because New Relic has shown that our SELECT queries are taking a long time (20,000+ ms) in some categories. I ran MySQLTuner 1.3.0 and got the following results... (Disclaimer: I restarted mySQL earlier after tweaking some settings, and so the results here may not be 100% accurate): >> MySQLTuner 1.3.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering [OK] Currently running supported MySQL version 5.5.37-35.0 [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +ARCHIVE +BLACKHOLE +CSV -FEDERATED +InnoDB +MRG_MYISAM [--] Data in MyISAM tables: 7G (Tables: 332) [--] Data in InnoDB tables: 213G (Tables: 8714) [--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17) [--] Data in MEMORY tables: 0B (Tables: 353) [!!] Total fragmented tables: 5492 -------- Security Recommendations ------------------------------------------- [!!] User '@host5.server1.autopartsnetwork.com' has no password set. [!!] User '@localhost' has no password set. [!!] User 'root@%' has no password set. -------- Performance Metrics ------------------------------------------------- [--] Up for: 5h 3m 4s (5M q [317.443 qps], 42K conn, TX: 18B, RX: 2B) [--] Reads / Writes: 95% / 5% [--] Total buffers: 35.5G global + 184.5M per thread (1024 max threads) [!!] Maximum possible memory usage: 220.0G (174% of installed RAM) [OK] Slow queries: 0% (6K/5M) [OK] Highest usage of available connections: 5% (61/1024) [OK] Key buffer size / total MyISAM indexes: 512.0M/3.1G [OK] Key buffer hit rate: 100.0% (102M cached / 45K reads) [OK] Query cache efficiency: 66.9% (3M cached / 5M selects) [!!] Query cache prunes per day: 3486361 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 812K sorts) [!!] Joins performed without indexes: 1328 [OK] Temporary tables created on disk: 11% (126K on disk / 1M total) [OK] Thread cache hit rate: 99% (61 created / 42K connections) [!!] Table cache hit rate: 19% (9K open / 49K opened) [OK] Open file limit used: 2% (712/25K) [OK] Table locks acquired immediately: 100% (5M immediate / 5M locks) [!!] InnoDB buffer pool / data size: 32.0G/213.4G [OK] InnoDB log waits: 0 -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Reduce your overall MySQL memory footprint for system stability Enable the slow query log to troubleshoot bad queries Increasing the query_cache size over 128M may reduce performance Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Read this before increasing table_cache over 64: http://bit.ly/1mi7c4C Variables to adjust: *** MySQL's maximum memory usage is dangerously high *** *** Add RAM before increasing MySQL buffer variables *** query_cache_size (> 512M) [see warning above] join_buffer_size (> 128.0M, or always use indexes with joins) table_cache (> 12288) innodb_buffer_pool_size (>= 213G) My my.cnf configuration is as follows... [client] port = 3306 [mysqld_safe] nice = 0 [mysqld] tmpdir = /var/lib/mysql/tmp user = mysql port = 3306 skip-external-locking character-set-server = utf8 collation-server = utf8_general_ci event_scheduler = 0 key_buffer = 512M max_allowed_packet = 64M thread_stack = 512K thread_cache_size = 512 sort_buffer_size = 24M read_buffer_size = 8M read_rnd_buffer_size = 24M join_buffer_size = 128M # for some nightly processes client sessions set the join buffer to 8 GB auto-increment-increment = 1 auto-increment-offset = 1 myisam-recover = BACKUP max_connections = 1024 # max connect errors artificially high to support behaviors of NetScaler monitors max_connect_errors = 999999 concurrent_insert = 2 connect_timeout = 5 wait_timeout = 180 net_read_timeout = 120 net_write_timeout = 120 back_log = 128 # this table_open_cache might be too low because of MySQL bugs #16244691 and #65384) table_open_cache = 12288 tmp_table_size = 512M max_heap_table_size = 512M bulk_insert_buffer_size = 512M open-files-limit = 8192 open-files = 1024 query_cache_type = 1 # large query limit supports SOAP and REST API integrations query_cache_limit = 4M # larger than 512 MB query cache size is problematic; this is typically ~60% full query_cache_size = 512M # set to true on read slaves read_only = false slow_query_log_file = /var/log/mysql/slow.log slow_query_log = 0 long_query_time = 0.2 expire_logs_days = 10 max_binlog_size = 1024M binlog_cache_size = 32K sync_binlog = 0 # SSD RAID10 technically has a write capacity of 10000 IOPS innodb_io_capacity = 400 innodb_file_per_table innodb_table_locks = true innodb_lock_wait_timeout = 30 # These servers have 80 CPU threads; match 1:1 innodb_thread_concurrency = 48 innodb_commit_concurrency = 2 innodb_support_xa = true innodb_buffer_pool_size = 32G innodb_file_per_table innodb_flush_log_at_trx_commit = 1 innodb_log_buffer_size = 2G skip-federated [mysqldump] quick quote-names single-transaction max_allowed_packet = 64M I have a monster of a server here to power our site because our catalog is very large (300,000 simple SKUs), and I'm just wondering if I'm missing anything that I can configure further. :-) Thanks!

    Read the article

  • MySQL – Export the Resultset to CSV file

    - by Pinal Dave
    In SQL Server, you can use BCP command to export the result set to a csv file. In MySQL too, You can export data from a table or result set as a csv file in many methods. Here are two methods. Method 1 : Make use of Work Bench If you are using Work Bench as a querying tool, you can make use of it’s Export option in the result window. Run the following code in Work Bench SELECT db_names FROM mysql_testing; The result will be shown in the result windows. There is an option called “File”. Click on it and it will prompt you a window to save the result set (Screen shot attached to show how file option can be used). Choose the directory and type out the name of the file. Method 2 : Make use of OUTFILE command You can do the export using a query with OUTFILE command as shown below SELECT db_names FROM mysql_testing INTO OUTFILE 'C:/testing.csv' FIELDS ENCLOSED BY '"' TERMINATED BY ';' ESCAPED BY '"' LINES TERMINATED BY '\r\n'; After the execution of the above code, you can find a file named testing.csv in C drive of the server. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Tips and Tricks, T SQL Tagged: CSV

    Read the article

  • MySQL – Introduction to CONCAT and CONCAT_WS functions

    - by Pinal Dave
    MySQL supports two types of concatenation functions. They are CONCAT and CONCAT_WS CONCAT function just concats all the argument values as such SELECT CONCAT('Television','Mobile','Furniture'); The above code returns the following TelevisionMobileFurniture If you want to concatenate them with a comma, either you need to specify the comma at the end of each value, or pass comma as an argument along with the values SELECT CONCAT('Television,','Mobile,','Furniture'); SELECT CONCAT('Television',',','Mobile',',','Furniture'); Both the above return the following Television,Mobile,Furniture However you can omit the extra work by using CONCAT_WS function. It stands for Concatenate with separator. This is very similar to CONCAT function, but accepts separator as the first argument. SELECT CONCAT_WS(',','Television','Mobile','Furniture'); The result is Television,Mobile,Furniture If you want pipeline as a separator, you can use SELECT CONCAT_WS('|','Television','Mobile','Furniture'); The result is Television|Mobile|Furniture So CONCAT_WS is very flexible in concatenating values along with separate. Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: MySQL, PostADay, SQL, SQL Authority, SQL Query, SQL Tips and Tricks, T SQL

    Read the article

  • MySQL syntax: can't create table

    - by peng
    mysql> create table balance_sheet( -> Cash_and_cash_equivalents VARCHAR(20), -> Trading_financial_assets VARCHAR(20), -> Note_receivable VARCHAR(20), -> Account_receivable VARCHAR(20), -> Advance_money VARCHAR(20), -> Interest_receivable VARCHAR(20), -> Dividend_receivable VARCHAR(20), -> Other_notes_receivable VARCHAR(20), -> Due_from_related_parties VARCHAR(20), -> Inventory VARCHAR(20), -> Consumptive_biological_assets VARCHAR(20), -> Non_current_asset(expire_in_a_year) VARCHAR(20), -> Other_current_assets VARCHAR(20), -> Total_current_assets VARCHAR(20), -> Available_for_sale_financial_assets VARCHAR(20), -> Held_to_maturity_investment VARCHAR(20), -> Long_term_account_receivable VARCHAR(20), -> Long_term_equity_investment VARCHAR(20), -> Investment_real_eastate VARCHAR(20), -> Fixed_assets VARCHAR(20), -> Construction_in_progress VARCHAR(20), -> Project_material VARCHAR(20), -> Liquidation_of_fixed_assets VARCHAR(20), -> Capitalized_biological_assets VARCHAR(20), -> Oil_and_gas_assets VARCHAR(20), -> Intangible_assets VARCHAR(20), -> R&d_expense VARCHAR(20), -> Goodwill VARCHAR(20), -> Deferred_assets VARCHAR(20), -> Deferred_income_tax_assets VARCHAR(20), -> Other_non_current_assets VARCHAR(20), -> Total_non_current_assets VARCHAR(20), -> Total_assets VARCHAR(20), -> Short_term_borrowing VARCHAR(20), -> Transaction_financial_liabilities VARCHAR(20), -> Notes_payable VARCHAR(20), -> Account_payable VARCHAR(20), -> Item_received_in_advance VARCHAR(20), -> Employee_pay_payable VARCHAR(20), -> Tax_payable VARCHAR(20), -> Interest_payable VARCHAR(20), -> Dividend_payable VARCHAR(20), -> Other_account_payable VARCHAR(20), -> Due_to_related_parties VARCHAR(20), -> Non_current_liabilities_due_within_one_year VARCHAR(20), -> Other_current_liabilities VARCHAR(20), -> Total_current_liabilities VARCHAR(20), -> Long_term_loan VARCHAR(20), -> Bonds_payable VARCHAR(20), -> Long_term_payable VARCHAR(20), -> Specific_payable VARCHAR(20), -> Estimated_liabilities VARCHAR(20), -> Deferred_income_tax_liabilities VARCHAR(20), -> Other_non_current_liabilities VARCHAR(20), -> Total_non_current_liabilities VARCHAR(20), -> Total_liabilities VARCHAR(20), -> Paid_in_capital VARCHAR(20), -> Contributed_surplus VARCHAR(20), -> Treasury_stock VARCHAR(20), -> Earned_surplus VARCHAR(20), -> Retained_earnings VARCHAR(20), -> Translation_reserve VARCHAR(20), -> Nonrecurring_items VARCHAR(20), -> Total_equity(non) VARCHAR(20), -> Minority_interests VARCHAR(20), -> Total_equity VARCHAR(20), -> Total_liabilities_&_shareholder's_equity VARCHAR(20)); '> '> when i press enter,there is the output of ' ,no other reaction ,what's wrong?

    Read the article

  • php MySQL snytax error

    - by Jacksta
    my scrip is supposed to look up contacts in a table and present thm on the screen to then be edited. however this is not this case. I am getting the error Parse error: syntax error, unexpected $end in /home/admin/domains/domain.com.au/public_html/pick_modcontact.php on line 50 NOTE: this is the last line in this script. <? session_start(); if ($_SESSION[valid] != "yes") { header( "Location: contact_menu.php"); exit; } $db_name = "testDB"; $table_name = "my_contacts"; $connection = @mysql_connect("localhost", "user", "pass") or die(mysql_error()); $db = @mysql_select_db($db_name, $connection) or die(mysql_error()); $sql = "SELECT id, f_name, l_name FROM $table_name ORDER BY f_name"; $result = @mysql_query($sql, $connection) or die(mysql_error()); $num = @mysql_num_rows($result); if ($num < 1) { $display_block = "<p><em>Sorry No Results!</em></p>"; } else { while ($row = mysql_fetch_array($result)) { $id = $row['id']; $f_name = $row['f_name']; $l_name = $row['l_name']; $option_block .= "<option value\"$id\">$l_name, $f_name</option>"; } $display_block = "<form method=\"POST\" action=\"show_modcontact.php\"> <p><strong>Contact:</strong> <select name=\"id\">$option_block</select> <input type=\"submit\" name=\"submit\" value=\"Select This Contact\"></p> </form>"; ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Modify A Contact</title> </head> <body> <h1>My Contact Management System</h1> <h2><em>Modify a Contact</em></h2> <p>Select a contact from the list below, to modify the contact's record.</p> <? echo "$display_block"; ?> <br> <p><a href="contact_menu.php">Return to Main Menu</a></p> </body> </html>

    Read the article

  • php mysql array - insert array info into mysql

    - by Michael
    I need to insert mutiple data in the field and then to retrieve it as an array. For example I need to insert "99999" into table item_details , field item_number, and the following data into field bidders associated with item_number : userx usery userz Can you please let me know what sql query should I use to insert the info and what query to retrieve it ? I know that this may be a silly question but I just can't figure it out . thank you in advance, Michael .

    Read the article

  • PHP Can't connect to MySQL on the production server

    - by Jairo Santos
    I'm having problems with connections with MySQL through PHP script. The MySQL user is root and I added GRANTS to root@'%' so I can connect from anywhere. Lets assume my MySQL host as "bigboy.com.br" The funny part is, from my local machine, on my test server, the script can connect to the MySQL server normally. But on the dedicated server where MySQL is running, the same PHP script gives me "Access denied for 'root'@'bigboy.com.br'" error.

    Read the article

  • How can I centralise MySQL data between 3 or more geographically separate servers?

    - by Andy Castles
    To explain the background to the question: We have a home-grown PHP application (for running online language-learning courses) running on a Linux server and using MySQL on localhost for saving user data (e.g. results of tests taken, marks of submitted work, time spent on different pages in the courses, etc). As we have students from different geographic locations we currently have 3 virtual servers hosted close to those locations (Spain, UK and Hong Kong) and users are added to the server closest to them (they access via different URLs, e.g. europe.domain.com, uk.domain.com and asia.domain.com). This works but is an administrative nightmare as we have to remember which server a particular user is on, and users can only connect to one server. We would like to somehow centralise the information so that all users are visible on any of the servers and users could connect to any of the 3 servers. The question is, what method should we use to implement this. It must be an issue that that lots of people have encountered but I haven't found anything conclusive after a fair bit of Googling around. The closest I have seen to solutions are: something like master-master replication, but I have read so many posts suggesting that this is not a good idea as things like auto_increment fields can break. circular replication, this sounded perfect but to quote from O'Reilly's High Performance MySQL, "In general, rings are brittle and best avoided" We're not against rewriting code in the application to make it work with whatever solution is required but I am not sure if replication is the correct thing to use. Thanks, Andy P.S. I should add that we experimented with writes to a central database and then using reads from a local database but the response time between the different servers for writing was pretty bad and it's also important that written data is available immediately for reading so if replication is too slow this could cause out-of-date data to be returned. Edit: I have been thinking about writing my own rudimentary replication script which would involve something like having each user given a server ID to say which is his "home server", e.g. users in asia would be marked as having the Hong Kong server as their own server. Then the replication scripts (which would be a PHP script set to run as a cron job reasonably frequently, e.g. every 15 minutes or so) would run independently on each of the servers in the system. They would go through the database and distribute any information about users with the "home server" set to the server that the script is running on to all of the other databases in the system. They would also need to suck new information which has been added to any of the other databases on the system where the "home server" flag is the server where the script is running. I would need to work out the details and build in the logic to deal with conflicts but I think it would be possible, however I wanted to make sure that there is not a correct solution for this already out there as it seems like it must be a problem that many people have already come across.

    Read the article

  • Symlink error when installing MySQL via Homebrew

    - by Asad Syed
    Trying to install MySQL via Homebrew. The install seems to work fine but i get an error: "Error: The linking step did not complete successfully The formula built, but is not symlinked into /usr/local You can try again using `brew link mysql'" Naturally, after this I ran: brew link mysql Which spat out: Error: Could not symlink file: /usr/local/Cellar/mysql/5.5.20/include/typelib.h /usr/local/include is not writable. You should change its permissions. So I ran it with sudo and got a "cowardly refusing to brew link mysql".

    Read the article

  • Change the mysql.sock that the system PHP uses

    - by Mark P.
    In the past I had installed MySQL as part of XAMPP in /Applications/XAMPP/ and the PHP that the system used (e.g. from command line) used /Applications/XAMPP/xamppfiles/var/mysql/mysql.sock to connect with MySQL. Now I have installed MySQL 5 with macports and I want my system PHP to be able to use that. I think I must set PHP to use /opt/local/var/run/mysql5/mysqld.sock but where do I set this? Is there a php.ini for CLI PHP?

    Read the article

  • nginx can't see MySQL

    - by user135235
    I have a fully working Joomla 2.5.6 install driven by a local MySQL server, but I'd like to test nginx to see if it's a faster web serving experience than Apache. \ PHP 5.4.6 (PHP54w) \ CentOS 6.2 \ Joomla 2.5.6 \ PHP54w-fpm.i386 (FastCGI process manager) \ php -m shows: mysql & mysqli modules loaded Nginx seems to have installed fine via yum, it can process a PHP-info file via FastCGI perfectly OK (http://37.128.190.241/php.php) but when I stop Apache, start nginx instead and visit my site I get: "Database connection error (1): The MySQL adapter 'mysqli' is not available." I've tried adjusting my Joomla configuration.php to use mysql instead of mysqli but I get the same basic error, only this time "Database connection error (1): The MySQL adapter 'mysql' is not available" of course! Can anyone think what the problem might be please? I did try explicitly setting extension = mysqli.so and extension = mysql.so in my php.ini to try and force the issue (despite php -m showing they were both successfully loaded anyway) - no difference. I have a pretty standard nginx default.conf: server { listen 80; server_name www.MYDOMAIN.com; server_name_in_redirect off; access_log /var/log/nginx/localhost.access_log main; error_log /var/log/nginx/localhost.error_log info; root /var/www/html/MYROOT_DIR; index index.php index.html index.htm default.html default.htm; # Support Clean (aka Search Engine Friendly) URLs location / { try_files $uri $uri/ /index.php?q=$uri&$args; } # deny running scripts inside writable directories location ~* /(images|cache|media|logs|tmp)/.*\.(php|pl|py|jsp|asp|sh|cgi)$ { return 403; error_page 403 /403_error.html; } location ~ \.php$ { fastcgi_pass 127.0.0.1:9000; fastcgi_index index.php; include fastcgi_params; fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name; include /etc/nginx/fastcgi.conf; } # caching of files location ~* \.(ico|pdf|flv)$ { expires 1y; } location ~* \.(js|css|png|jpg|jpeg|gif|swf|xml|txt)$ { expires 14d; } } Snip of output from phpinfo under nginx: Server API FPM/FastCGI Virtual Directory Support disabled Configuration File (php.ini) Path /etc Loaded Configuration File /etc/php.ini Scan this dir for additional .ini files /etc/php.d Additional .ini files parsed /etc/php.d/curl.ini, /etc/php.d/fileinfo.ini, /etc/php.d/json.ini, /etc/php.d/phar.ini, /etc/php.d/zip.ini Snip of output from phpinfo under Apache: Server API Apache 2.0 Handler Virtual Directory Support disabled Configuration File (php.ini) Path /etc Loaded Configuration File /etc/php.ini Scan this dir for additional .ini files /etc/php.d Additional .ini files parsed /etc/php.d/curl.ini, /etc/php.d/fileinfo.ini, /etc/php.d/json.ini, /etc/php.d/mysql.ini, /etc/php.d/mysqli.ini, /etc/php.d/pdo.ini, /etc/php.d/pdo_mysql.ini, /etc/php.d/pdo_sqlite.ini, /etc/php.d/phar.ini, /etc/php.d/sqlite3.ini, /etc/php.d/zip.ini Seems that with Apache, PHP is loading substantially more additional .ini files, including ones relating to mysql (mysql.ini, mysqli.ini, pdo_mysql.ini) than nginx. Any ideas how I get nginix to also call these additional .ini's ? Thanks in advance, Steve

    Read the article

  • Not able to installl mysql-server ubuntu

    - by makdotgnu
    I'm not able to install mysql-server on ubuntu 10.04 I had installed mysql but it was giving this error ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) So I removed mysql-server completely from synaptic but after this i'm not able to reinstall it. When I try to reinstall it synaptic fridges. how to do remove each file of mysql and install it ?

    Read the article

  • mysql server overloading without error

    - by beny
    Hi, I have a serious problem on my server with MySQL server, it overload itself without any error in /var/log/mysqld What steps should I do to find out the problem ? my.cnf is [mysqld] set-variable=local-infile=0 datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock user=mysql old_passwords=1 skip-bdb set-variable = innodb_buffer_pool_size=256M set-variable = innodb_additional_mem_pool_size=20M set-variable = innodb_log_file_size=128M set-variable = innodb_log_buffer_size=8M innodb_data_file_path = ibdata1:1000M:autoextend Please help, thx

    Read the article

  • Cloning a VM to add a new MySQL Slave

    - by Ben Holness
    I am in the process of adding a new slave to a replicated mysql setup. All of the slave nodes are virtual machines. If I clone one of the nodes to a new VM, then start it with no networking, stop mysql, change the server-id in my.cnf to a new id and then restart mysql and networking, will it all work correctly, or will mysql get confused because it used to be a different server id? OS: Ubuntu 10.10 VM Platform: VMWare 5 MySQL : Server version: 5.1.49-1ubuntu8.1-log (Ubuntu)

    Read the article

  • What is the fastest method to restore MySQL replication?

    - by dwhere
    I have a MySQL (5.1) master-slave replication pair and replication to the slave has failed. It failed because the master ran out of disk space and the relay-logs became corrupt. The master is now back online and working properly. Since there is this error in the log the slave process can't simply be restarted. The server has a single 40GB InnoDB database and I would like to know what is the fastest method for getting the slave back in sync to minimize downtime.

    Read the article

  • mysql on ubunto 4 is not running and is not remotely accessible

    - by user628119
    Currently i installed ubunot on ubunto 4.4, and using following commands i can see mysql, mysqld running ps -ef | grep mysql ps -ef | grep mysqld but when I run, netstat i don't see mysql and 3306 anywhere. in my.cnf file, i have given my ip and port is 3306. also when i run this command sudo netstat -tap | grep mysql i don't see anything and commands I needed to run the mysql 5 on port 3306 and on ip=x.x.x.x for remotely accessible Looking forward to your reply

    Read the article

  • Unresolved external symbol - MySQL API C++

    - by Zack_074
    Hey I've had nonstop problems with SQL. I'm trying to get some experience because I know it's a vital part of the industry. I got it working with C#, but now I'm working on connecting to a database in c++. I have the project properly linked and what not. Here's my code and the errors I'm getting. #include "stdafx.h" #include <mysql.h> #include <iostream> MYSQL mysql; MYSQL_RES result; using namespace std; int _tmain(int argc, _TCHAR* argv[]) { mysql_init(&mysql); if(!mysql_real_connect(&mysql, "localhost", "root", "angel552002", "MyDatabse", 0, NULL, 0)) { printf("Failed to connect"); } return 0; } and the errors: Error 1 error LNK2001: unresolved external symbol _mysql_real_connect@32 c:\Users\Zack-074\documents\visual studio 2010\Projects\MySql\MySql\MySql.obj Error 2 error LNK2001: unresolved external symbol _mysql_init@4 c:\Users\Zack-074\documents\visual studio 2010\Projects\MySql\MySql\MySql.obj Error 3 error LNK1120: 2 unresolved externals c:\users\zack-074\documents\visual studio 2010\Projects\MySql\Debug\MySql.exe 1 I really appreciate the help.

    Read the article

  • MYSQL not running on Ubuntu OS - Error 2002.

    - by mgj
    Hi, I am a novice to mysql DB. I am trying to run the MYSQL Server on Ubuntu 10.04. Through Synaptic Package Manager I am have installed the mysql version: mysql-client-5.1 I wonder that how was the database password set for the mysql-client software that I installed through the above way.It would be nice if you could enlighten me on this. When I tried running this database, I encountered the error given below: mohnish@mohnish-laptop:/var/lib$ mysql ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) mohnish@mohnish-laptop:/var/lib$ I referred to a similar question posted by another user. I didn't find a solution through the proposed answers. For instance when I tried the solutions posted for the similar question I got the following: mohnish@mohnish-laptop:/var/lib$ service start mysqld start: unrecognized service mohnish@mohnish-laptop:/var/lib$ ps -u mysql ERROR: User name does not exist. ********* simple selection ********* ********* selection by list ********* -A all processes -C by command name -N negate selection -G by real group ID (supports names) -a all w/ tty except session leaders -U by real user ID (supports names) -d all except session leaders -g by session OR by effective group name -e all processes -p by process ID T all processes on this terminal -s processes in the sessions given a all w/ tty, including other users -t by tty g OBSOLETE -- DO NOT USE -u by effective user ID (supports names) r only running processes U processes for specified users x processes w/o controlling ttys t by tty *********** output format ********** *********** long options *********** -o,o user-defined -f full --Group --User --pid --cols --ppid -j,j job control s signal --group --user --sid --rows --info -O,O preloaded -o v virtual memory --cumulative --format --deselect -l,l long u user-oriented --sort --tty --forest --version -F extra full X registers --heading --no-heading --context ********* misc options ********* -V,V show version L list format codes f ASCII art forest -m,m,-L,-T,H threads S children in sum -y change -l format -M,Z security data c true command name -c scheduling class -w,w wide output n numeric WCHAN,UID -H process hierarchy mohnish@mohnish-laptop:/var/lib$ which mysql /usr/bin/mysql mohnish@mohnish-laptop:/var/lib$ mysql ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) I even tried referring to http://forums.mysql.com/read.php?11,27769,84713#msg-84713 but couldn't find anything useful. Please let me know how I could tackle this error. Thank you very much..

    Read the article

  • MySQL Config File for Large System

    - by Jonathon
    We are running MySQL on a Windows 2003 Server Enterpise Edition box. MySQL is about the only program running on the box. We have approx. 8 slaves replicated to it, but my understanding is that having multiple slaves connecting to the same master does not significantly slow down performance, if at all. The master server has 16G RAM, 10 Terabyte drives in RAID 10, and four dual-core processors. From what I have seen from other sites, we have a really robust machine as our master db server. We just upgraded from a machine with only 4G RAM, but with similar hard drives, RAID, etc. It also ran Apache on it, so it was our db server and our application server. It was getting a little slow, so we split the db server onto this new machine and kept the application server on the first machine. We also distributed the application load amongst a few of our other slave servers, which also run the application. The problem is the new db server has mysqld.exe consuming 95-100% of CPU almost all the time and is really causing the app to run slowly. I know we have several queries and table structures that could be better optimized, but since they worked okay on the older, smaller server, I assume that our my.ini (MySQL config) file is not properly configured. Most of what I see on the net is for setting config files on small machines, so can anyone help me get the my.ini file correct for a large dedicated machine like ours? I just don't see how mysqld could get so bogged down! FYI: We have about 100 queries per second. We only use MyISAM tables, so skip-innodb is set in the ini file. And yes, I know it is reading the ini file correctly because I can change some settings (like the server-id and it will kill the server at startup). Here is the my.ini file: #MySQL Server Instance Configuration File # ---------------------------------------------------------------------- # Generated by the MySQL Server Instance Configuration Wizard # # # Installation Instructions # ---------------------------------------------------------------------- # # On Linux you can copy this file to /etc/my.cnf to set global options, # mysql-data-dir/my.cnf to set server-specific options # (@localstatedir@ for this installation) or to # ~/.my.cnf to set user-specific options. # # On Windows you should keep this file in the installation directory # of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To # make sure the server reads the config file use the startup option # "--defaults-file". # # To run run the server from the command line, execute this in a # command line shell, e.g. # mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" # # To install the server as a Windows service manually, execute this in a # command line shell, e.g. # mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini" # # And then execute this in a command line shell to start the server, e.g. # net start MySQLXY # # # Guildlines for editing this file # ---------------------------------------------------------------------- # # In this file, you can use all long options that the program supports. # If you want to know the options a program supports, start the program # with the "--help" option. # # More detailed information about the individual options can also be # found in the manual. # # # CLIENT SECTION # ---------------------------------------------------------------------- # # The following options will be read by MySQL client applications. # Note that only client applications shipped by MySQL are guaranteed # to read this section. If you want your own MySQL client program to # honor these values, you need to specify it as an option during the # MySQL client library initialization. # [client] port=3306 [mysql] default-character-set=latin1 # SERVER SECTION # ---------------------------------------------------------------------- # # The following options will be read by the MySQL Server. Make sure that # you have installed the server correctly (see above) so it reads this # file. # [mysqld] # The TCP/IP Port the MySQL Server will listen on port=3306 #Path to installation directory. All paths are usually resolved relative to this. basedir="D:/MySQL/" #Path to the database root datadir="D:/MySQL/data" # The default character set that will be used when a new schema or table is # created and no character set is defined default-character-set=latin1 # The default storage engine that will be used when create new tables when default-storage-engine=MYISAM # Set the SQL mode to strict #sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION" # we changed this because there are a couple of queries that can get blocked otherwise sql-mode="" #performance configs skip-locking max_allowed_packet = 1M table_open_cache = 512 # The maximum amount of concurrent sessions the MySQL server will # allow. One of these connections will be reserved for a user with # SUPER privileges to allow the administrator to login even if the # connection limit has been reached. max_connections=1510 # Query cache is used to cache SELECT results and later return them # without actual executing the same query once again. Having the query # cache enabled may result in significant speed improvements, if your # have a lot of identical queries and rarely changing tables. See the # "Qcache_lowmem_prunes" status variable to check if the current value # is high enough for your load. # Note: In case your tables change very often or if your queries are # textually different every time, the query cache may result in a # slowdown instead of a performance improvement. query_cache_size=168M # The number of open tables for all threads. Increasing this value # increases the number of file descriptors that mysqld requires. # Therefore you have to make sure to set the amount of open files # allowed to at least 4096 in the variable "open-files-limit" in # section [mysqld_safe] table_cache=3020 # Maximum size for internal (in-memory) temporary tables. If a table # grows larger than this value, it is automatically converted to disk # based table This limitation is for a single table. There can be many # of them. tmp_table_size=30M # How many threads we should keep in a cache for reuse. When a client # disconnects, the client's threads are put in the cache if there aren't # more than thread_cache_size threads from before. This greatly reduces # the amount of thread creations needed if you have a lot of new # connections. (Normally this doesn't give a notable performance # improvement if you have a good thread implementation.) thread_cache_size=64 #*** MyISAM Specific options # The maximum size of the temporary file MySQL is allowed to use while # recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE. # If the file-size would be bigger than this, the index will be created # through the key cache (which is slower). myisam_max_sort_file_size=100G # If the temporary file used for fast index creation would be bigger # than using the key cache by the amount specified here, then prefer the # key cache method. This is mainly used to force long character keys in # large tables to use the slower key cache method to create the index. myisam_sort_buffer_size=64M # Size of the Key Buffer, used to cache index blocks for MyISAM tables. # Do not set it larger than 30% of your available memory, as some memory # is also required by the OS to cache rows. Even if you're not using # MyISAM tables, you should still set it to 8-64M as it will also be # used for internal temporary disk tables. key_buffer_size=3072M # Size of the buffer used for doing full table scans of MyISAM tables. # Allocated per thread, if a full scan is needed. read_buffer_size=2M read_rnd_buffer_size=8M # This buffer is allocated when MySQL needs to rebuild the index in # REPAIR, OPTIMZE, ALTER table statements as well as in LOAD DATA INFILE # into an empty table. It is allocated per thread so be careful with # large settings. sort_buffer_size=2M #*** INNODB Specific options *** innodb_data_home_dir="D:/MySQL InnoDB Datafiles/" # Use this option if you have a MySQL server with InnoDB support enabled # but you do not plan to use it. This will save memory and disk space # and speed up some things. skip-innodb # Additional memory pool that is used by InnoDB to store metadata # information. If InnoDB requires more memory for this purpose it will # start to allocate it from the OS. As this is fast enough on most # recent operating systems, you normally do not need to change this # value. SHOW INNODB STATUS will display the current amount used. innodb_additional_mem_pool_size=11M # If set to 1, InnoDB will flush (fsync) the transaction logs to the # disk at each commit, which offers full ACID behavior. If you are # willing to compromise this safety, and you are running small # transactions, you may set this to 0 or 2 to reduce disk I/O to the # logs. Value 0 means that the log is only written to the log file and # the log file flushed to disk approximately once per second. Value 2 # means the log is written to the log file at each commit, but the log # file is only flushed to disk approximately once per second. innodb_flush_log_at_trx_commit=1 # The size of the buffer InnoDB uses for buffering log data. As soon as # it is full, InnoDB will have to flush it to disk. As it is flushed # once per second anyway, it does not make sense to have it very large # (even with long transactions). innodb_log_buffer_size=6M # InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and # row data. The bigger you set this the less disk I/O is needed to # access data in tables. On a dedicated database server you may set this # parameter up to 80% of the machine physical memory size. Do not set it # too large, though, because competition of the physical memory may # cause paging in the operating system. Note that on 32bit systems you # might be limited to 2-3.5G of user level memory per process, so do not # set it too high. innodb_buffer_pool_size=500M # Size of each log file in a log group. You should set the combined size # of log files to about 25%-100% of your buffer pool size to avoid # unneeded buffer pool flush activity on log file overwrite. However, # note that a larger logfile size will increase the time needed for the # recovery process. innodb_log_file_size=100M # Number of threads allowed inside the InnoDB kernel. The optimal value # depends highly on the application, hardware as well as the OS # scheduler properties. A too high value may lead to thread thrashing. innodb_thread_concurrency=10 #replication settings (this is the master) log-bin=log server-id = 1 Thanks for all the help. It is greatly appreciated.

    Read the article

  • MySQL InnoDB Cascade Rule that looks at 2 columns?

    - by Travis
    I have the following MySQL InnoDB tables... TABLE foldersA ( ID title ) TABLE foldersB ( ID title ) TABLE records ( ID folderID folderType title ) folderID in table "records" can point to ID in either "foldersA" or "foldersB" depending on the value of folderType. (0 or 1). I am wondering: Is there a way to create a CASCADE rule such that the appropriate rows in table records are automatically deleted when a row in either foldersA or folderB is deleted? Or in this situation, am I forced to have to delete the rows in table "records" programatically? Thanks for you help!

    Read the article

  • Extracting Data Daily from MySQL to a Local MySQL DB

    - by Sunny Juneja
    I'm doing some experiments locally that require some data from a production MySQL DB that I only have read access to. The schemas are nearly identical with the exception of the omission of one column. My goal is to write a script that I can run everyday that extracts the previous day's data and imports it into my local table. The part that I'm most confused about is how to download the data. I've seen names like mysqldump be tossed around but that seems a way to replicate the entire database. I would love to avoid using php seeing as I have no experience with it. I've been creating CSVs but I'm worried about having the data integrity (what if there is a comma in a field or a \n) as well as the size of the CSV (there are several hundred thousand rows per day).

    Read the article

  • php/mySQL error: mysql_num_rows(): supplied argument is not a valid MySQL result

    - by Michael Robinson
    I'm trying to INSERT INTO a mySQL database and I'm getting this error on: if (mysql_num_rows($login) == 1){ Here is the php, The php does add the user to the database. I can't figure it out. <? session_start(); require("config.php"); $u = $_GET['username']; $pw = $_GET['password']; $pwh = $_GET['passwordhint']; $em = $_GET['email']; $zc = $_GET['zipcode']; $check = "INSERT INTO family (loginName, email, password, passwordhint, location) VALUES ('$u', '$pw', '$pwh', '$em', '$zc')"; $login = mysql_query($check, $link) or die(mysql_error()); if (mysql_num_rows($login) == 1) { $row = mysql_fetch_assoc($login); echo 'Yes';exit; } else { echo 'No';exit; } mysql_close($link); ?> Thanks,

    Read the article

  • Improving Partitioned Table Join Performance

    - by Paul White
    The query optimizer does not always choose an optimal strategy when joining partitioned tables. This post looks at an example, showing how a manual rewrite of the query can almost double performance, while reducing the memory grant to almost nothing. Test Data The two tables in this example use a common partitioning partition scheme. The partition function uses 41 equal-size partitions: CREATE PARTITION FUNCTION PFT (integer) AS RANGE RIGHT FOR VALUES ( 125000, 250000, 375000, 500000, 625000, 750000, 875000, 1000000, 1125000, 1250000, 1375000, 1500000, 1625000, 1750000, 1875000, 2000000, 2125000, 2250000, 2375000, 2500000, 2625000, 2750000, 2875000, 3000000, 3125000, 3250000, 3375000, 3500000, 3625000, 3750000, 3875000, 4000000, 4125000, 4250000, 4375000, 4500000, 4625000, 4750000, 4875000, 5000000 ); GO CREATE PARTITION SCHEME PST AS PARTITION PFT ALL TO ([PRIMARY]); There two tables are: CREATE TABLE dbo.T1 ( TID integer NOT NULL IDENTITY(0,1), Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T1 PRIMARY KEY CLUSTERED (TID) ON PST (TID) );   CREATE TABLE dbo.T2 ( TID integer NOT NULL, Column1 integer NOT NULL, Padding binary(100) NOT NULL DEFAULT 0x,   CONSTRAINT PK_T2 PRIMARY KEY CLUSTERED (TID, Column1) ON PST (TID) ); The next script loads 5 million rows into T1 with a pseudo-random value between 1 and 5 for Column1. The table is partitioned on the IDENTITY column TID: INSERT dbo.T1 WITH (TABLOCKX) (Column1) SELECT (ABS(CHECKSUM(NEWID())) % 5) + 1 FROM dbo.Numbers AS N WHERE n BETWEEN 1 AND 5000000; In case you don’t already have an auxiliary table of numbers lying around, here’s a script to create one with 10 million rows: CREATE TABLE dbo.Numbers (n bigint PRIMARY KEY);   WITH L0 AS(SELECT 1 AS c UNION ALL SELECT 1), L1 AS(SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B), L2 AS(SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B), L3 AS(SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B), L4 AS(SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B), L5 AS(SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B), Nums AS(SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L5) INSERT dbo.Numbers WITH (TABLOCKX) SELECT TOP (10000000) n FROM Nums ORDER BY n OPTION (MAXDOP 1); Table T1 contains data like this: Next we load data into table T2. The relationship between the two tables is that table 2 contains ‘n’ rows for each row in table 1, where ‘n’ is determined by the value in Column1 of table T1. There is nothing particularly special about the data or distribution, by the way. INSERT dbo.T2 WITH (TABLOCKX) (TID, Column1) SELECT T.TID, N.n FROM dbo.T1 AS T JOIN dbo.Numbers AS N ON N.n >= 1 AND N.n <= T.Column1; Table T2 ends up containing about 15 million rows: The primary key for table T2 is a combination of TID and Column1. The data is partitioned according to the value in column TID alone. Partition Distribution The following query shows the number of rows in each partition of table T1: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T1 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are 40 partitions containing 125,000 rows (40 * 125k = 5m rows). The rightmost partition remains empty. The next query shows the distribution for table 2: SELECT PartitionID = CA1.P, NumRows = COUNT_BIG(*) FROM dbo.T2 AS T CROSS APPLY (VALUES ($PARTITION.PFT(TID))) AS CA1 (P) GROUP BY CA1.P ORDER BY CA1.P; There are roughly 375,000 rows in each partition (the rightmost partition is also empty): Ok, that’s the test data done. Test Query and Execution Plan The task is to count the rows resulting from joining tables 1 and 2 on the TID column: SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; The optimizer chooses a plan using parallel hash join, and partial aggregation: The Plan Explorer plan tree view shows accurate cardinality estimates and an even distribution of rows across threads (click to enlarge the image): With a warm data cache, the STATISTICS IO output shows that no physical I/O was needed, and all 41 partitions were touched: Running the query without actual execution plan or STATISTICS IO information for maximum performance, the query returns in around 2600ms. Execution Plan Analysis The first step toward improving on the execution plan produced by the query optimizer is to understand how it works, at least in outline. The two parallel Clustered Index Scans use multiple threads to read rows from tables T1 and T2. Parallel scan uses a demand-based scheme where threads are given page(s) to scan from the table as needed. This arrangement has certain important advantages, but does result in an unpredictable distribution of rows amongst threads. The point is that multiple threads cooperate to scan the whole table, but it is impossible to predict which rows end up on which threads. For correct results from the parallel hash join, the execution plan has to ensure that rows from T1 and T2 that might join are processed on the same thread. For example, if a row from T1 with join key value ‘1234’ is placed in thread 5’s hash table, the execution plan must guarantee that any rows from T2 that also have join key value ‘1234’ probe thread 5’s hash table for matches. The way this guarantee is enforced in this parallel hash join plan is by repartitioning rows to threads after each parallel scan. The two repartitioning exchanges route rows to threads using a hash function over the hash join keys. The two repartitioning exchanges use the same hash function so rows from T1 and T2 with the same join key must end up on the same hash join thread. Expensive Exchanges This business of repartitioning rows between threads can be very expensive, especially if a large number of rows is involved. The execution plan selected by the optimizer moves 5 million rows through one repartitioning exchange and around 15 million across the other. As a first step toward removing these exchanges, consider the execution plan selected by the optimizer if we join just one partition from each table, disallowing parallelism: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = 1 AND $PARTITION.PFT(T2.TID) = 1 OPTION (MAXDOP 1); The optimizer has chosen a (one-to-many) merge join instead of a hash join. The single-partition query completes in around 100ms. If everything scaled linearly, we would expect that extending this strategy to all 40 populated partitions would result in an execution time around 4000ms. Using parallelism could reduce that further, perhaps to be competitive with the parallel hash join chosen by the optimizer. This raises a question. If the most efficient way to join one partition from each of the tables is to use a merge join, why does the optimizer not choose a merge join for the full query? Forcing a Merge Join Let’s force the optimizer to use a merge join on the test query using a hint: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN); This is the execution plan selected by the optimizer: This plan results in the same number of logical reads reported previously, but instead of 2600ms the query takes 5000ms. The natural explanation for this drop in performance is that the merge join plan is only using a single thread, whereas the parallel hash join plan could use multiple threads. Parallel Merge Join We can get a parallel merge join plan using the same query hint as before, and adding trace flag 8649: SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (MERGE JOIN, QUERYTRACEON 8649); The execution plan is: This looks promising. It uses a similar strategy to distribute work across threads as seen for the parallel hash join. In practice though, performance is disappointing. On a typical run, the parallel merge plan runs for around 8400ms; slower than the single-threaded merge join plan (5000ms) and much worse than the 2600ms for the parallel hash join. We seem to be going backwards! The logical reads for the parallel merge are still exactly the same as before, with no physical IOs. The cardinality estimates and thread distribution are also still very good (click to enlarge): A big clue to the reason for the poor performance is shown in the wait statistics (captured by Plan Explorer Pro): CXPACKET waits require careful interpretation, and are most often benign, but in this case excessive waiting occurs at the repartitioning exchanges. Unlike the parallel hash join, the repartitioning exchanges in this plan are order-preserving ‘merging’ exchanges (because merge join requires ordered inputs): Parallelism works best when threads can just grab any available unit of work and get on with processing it. Preserving order introduces inter-thread dependencies that can easily lead to significant waits occurring. In extreme cases, these dependencies can result in an intra-query deadlock, though the details of that will have to wait for another time to explore in detail. The potential for waits and deadlocks leads the query optimizer to cost parallel merge join relatively highly, especially as the degree of parallelism (DOP) increases. This high costing resulted in the optimizer choosing a serial merge join rather than parallel in this case. The test results certainly confirm its reasoning. Collocated Joins In SQL Server 2008 and later, the optimizer has another available strategy when joining tables that share a common partition scheme. This strategy is a collocated join, also known as as a per-partition join. It can be applied in both serial and parallel execution plans, though it is limited to 2-way joins in the current optimizer. Whether the optimizer chooses a collocated join or not depends on cost estimation. The primary benefits of a collocated join are that it eliminates an exchange and requires less memory, as we will see next. Costing and Plan Selection The query optimizer did consider a collocated join for our original query, but it was rejected on cost grounds. The parallel hash join with repartitioning exchanges appeared to be a cheaper option. There is no query hint to force a collocated join, so we have to mess with the costing framework to produce one for our test query. Pretending that IOs cost 50 times more than usual is enough to convince the optimizer to use collocated join with our test query: -- Pretend IOs are 50x cost temporarily DBCC SETIOWEIGHT(50);   -- Co-located hash join SELECT COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID OPTION (RECOMPILE);   -- Reset IO costing DBCC SETIOWEIGHT(1); Collocated Join Plan The estimated execution plan for the collocated join is: The Constant Scan contains one row for each partition of the shared partitioning scheme, from 1 to 41. The hash repartitioning exchanges seen previously are replaced by a single Distribute Streams exchange using Demand partitioning. Demand partitioning means that the next partition id is given to the next parallel thread that asks for one. My test machine has eight logical processors, and all are available for SQL Server to use. As a result, there are eight threads in the single parallel branch in this plan, each processing one partition from each table at a time. Once a thread finishes processing a partition, it grabs a new partition number from the Distribute Streams exchange…and so on until all partitions have been processed. It is important to understand that the parallel scans in this plan are different from the parallel hash join plan. Although the scans have the same parallelism icon, tables T1 and T2 are not being co-operatively scanned by multiple threads in the same way. Each thread reads a single partition of T1 and performs a hash match join with the same partition from table T2. The properties of the two Clustered Index Scans show a Seek Predicate (unusual for a scan!) limiting the rows to a single partition: The crucial point is that the join between T1 and T2 is on TID, and TID is the partitioning column for both tables. A thread that processes partition ‘n’ is guaranteed to see all rows that can possibly join on TID for that partition. In addition, no other thread will see rows from that partition, so this removes the need for repartitioning exchanges. CPU and Memory Efficiency Improvements The collocated join has removed two expensive repartitioning exchanges and added a single exchange processing 41 rows (one for each partition id). Remember, the parallel hash join plan exchanges had to process 5 million and 15 million rows. The amount of processor time spent on exchanges will be much lower in the collocated join plan. In addition, the collocated join plan has a maximum of 8 threads processing single partitions at any one time. The 41 partitions will all be processed eventually, but a new partition is not started until a thread asks for it. Threads can reuse hash table memory for the new partition. The parallel hash join plan also had 8 hash tables, but with all 5,000,000 build rows loaded at the same time. The collocated plan needs memory for only 8 * 125,000 = 1,000,000 rows at any one time. Collocated Hash Join Performance The collated join plan has disappointing performance in this case. The query runs for around 25,300ms despite the same IO statistics as usual. This is much the worst result so far, so what went wrong? It turns out that cardinality estimation for the single partition scans of table T1 is slightly low. The properties of the Clustered Index Scan of T1 (graphic immediately above) show the estimation was for 121,951 rows. This is a small shortfall compared with the 125,000 rows actually encountered, but it was enough to cause the hash join to spill to physical tempdb: A level 1 spill doesn’t sound too bad, until you realize that the spill to tempdb probably occurs for each of the 41 partitions. As a side note, the cardinality estimation error is a little surprising because the system tables accurately show there are 125,000 rows in every partition of T1. Unfortunately, the optimizer uses regular column and index statistics to derive cardinality estimates here rather than system table information (e.g. sys.partitions). Collocated Merge Join We will never know how well the collocated parallel hash join plan might have worked without the cardinality estimation error (and the resulting 41 spills to tempdb) but we do know: Merge join does not require a memory grant; and Merge join was the optimizer’s preferred join option for a single partition join Putting this all together, what we would really like to see is the same collocated join strategy, but using merge join instead of hash join. Unfortunately, the current query optimizer cannot produce a collocated merge join; it only knows how to do collocated hash join. So where does this leave us? CROSS APPLY sys.partitions We can try to write our own collocated join query. We can use sys.partitions to find the partition numbers, and CROSS APPLY to get a count per partition, with a final step to sum the partial counts. The following query implements this idea: SELECT row_count = SUM(Subtotals.cnt) FROM ( -- Partition numbers SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1 ) AS P CROSS APPLY ( -- Count per collocated join SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals; The estimated plan is: The cardinality estimates aren’t all that good here, especially the estimate for the scan of the system table underlying the sys.partitions view. Nevertheless, the plan shape is heading toward where we would like to be. Each partition number from the system table results in a per-partition scan of T1 and T2, a one-to-many Merge Join, and a Stream Aggregate to compute the partial counts. The final Stream Aggregate just sums the partial counts. Execution time for this query is around 3,500ms, with the same IO statistics as always. This compares favourably with 5,000ms for the serial plan produced by the optimizer with the OPTION (MERGE JOIN) hint. This is another case of the sum of the parts being less than the whole – summing 41 partial counts from 41 single-partition merge joins is faster than a single merge join and count over all partitions. Even so, this single-threaded collocated merge join is not as quick as the original parallel hash join plan, which executed in 2,600ms. On the positive side, our collocated merge join uses only one logical processor and requires no memory grant. The parallel hash join plan used 16 threads and reserved 569 MB of memory:   Using a Temporary Table Our collocated merge join plan should benefit from parallelism. The reason parallelism is not being used is that the query references a system table. We can work around that by writing the partition numbers to a temporary table (or table variable): SET STATISTICS IO ON; DECLARE @s datetime2 = SYSUTCDATETIME();   CREATE TABLE #P ( partition_number integer PRIMARY KEY);   INSERT #P (partition_number) SELECT p.partition_number FROM sys.partitions AS p WHERE p.[object_id] = OBJECT_ID(N'T1', N'U') AND p.index_id = 1;   SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals;   DROP TABLE #P;   SELECT DATEDIFF(Millisecond, @s, SYSUTCDATETIME()); SET STATISTICS IO OFF; Using the temporary table adds a few logical reads, but the overall execution time is still around 3500ms, indistinguishable from the same query without the temporary table. The problem is that the query optimizer still doesn’t choose a parallel plan for this query, though the removal of the system table reference means that it could if it chose to: In fact the optimizer did enter the parallel plan phase of query optimization (running search 1 for a second time): Unfortunately, the parallel plan found seemed to be more expensive than the serial plan. This is a crazy result, caused by the optimizer’s cost model not reducing operator CPU costs on the inner side of a nested loops join. Don’t get me started on that, we’ll be here all night. In this plan, everything expensive happens on the inner side of a nested loops join. Without a CPU cost reduction to compensate for the added cost of exchange operators, candidate parallel plans always look more expensive to the optimizer than the equivalent serial plan. Parallel Collocated Merge Join We can produce the desired parallel plan using trace flag 8649 again: SELECT row_count = SUM(Subtotals.cnt) FROM #P AS p CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: One difference between this plan and the collocated hash join plan is that a Repartition Streams exchange operator is used instead of Distribute Streams. The effect is similar, though not quite identical. The Repartition uses round-robin partitioning, meaning the next partition id is pushed to the next thread in sequence. The Distribute Streams exchange seen earlier used Demand partitioning, meaning the next partition id is pulled across the exchange by the next thread that is ready for more work. There are subtle performance implications for each partitioning option, but going into that would again take us too far off the main point of this post. Performance The important thing is the performance of this parallel collocated merge join – just 1350ms on a typical run. The list below shows all the alternatives from this post (all timings include creation, population, and deletion of the temporary table where appropriate) from quickest to slowest: Collocated parallel merge join: 1350ms Parallel hash join: 2600ms Collocated serial merge join: 3500ms Serial merge join: 5000ms Parallel merge join: 8400ms Collated parallel hash join: 25,300ms (hash spill per partition) The parallel collocated merge join requires no memory grant (aside from a paltry 1.2MB used for exchange buffers). This plan uses 16 threads at DOP 8; but 8 of those are (rather pointlessly) allocated to the parallel scan of the temporary table. These are minor concerns, but it turns out there is a way to address them if it bothers you. Parallel Collocated Merge Join with Demand Partitioning This final tweak replaces the temporary table with a hard-coded list of partition ids (dynamic SQL could be used to generate this query from sys.partitions): SELECT row_count = SUM(Subtotals.cnt) FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10), (11),(12),(13),(14),(15),(16),(17),(18),(19),(20), (21),(22),(23),(24),(25),(26),(27),(28),(29),(30), (31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41) ) AS P (partition_number) CROSS APPLY ( SELECT cnt = COUNT_BIG(*) FROM dbo.T1 AS T1 JOIN dbo.T2 AS T2 ON T2.TID = T1.TID WHERE $PARTITION.PFT(T1.TID) = p.partition_number AND $PARTITION.PFT(T2.TID) = p.partition_number ) AS SubTotals OPTION (QUERYTRACEON 8649); The actual execution plan is: The parallel collocated hash join plan is reproduced below for comparison: The manual rewrite has another advantage that has not been mentioned so far: the partial counts (per partition) can be computed earlier than the partial counts (per thread) in the optimizer’s collocated join plan. The earlier aggregation is performed by the extra Stream Aggregate under the nested loops join. The performance of the parallel collocated merge join is unchanged at around 1350ms. Final Words It is a shame that the current query optimizer does not consider a collocated merge join (Connect item closed as Won’t Fix). The example used in this post showed an improvement in execution time from 2600ms to 1350ms using a modestly-sized data set and limited parallelism. In addition, the memory requirement for the query was almost completely eliminated  – down from 569MB to 1.2MB. The problem with the parallel hash join selected by the optimizer is that it attempts to process the full data set all at once (albeit using eight threads). It requires a large memory grant to hold all 5 million rows from table T1 across the eight hash tables, and does not take advantage of the divide-and-conquer opportunity offered by the common partitioning. The great thing about the collocated join strategies is that each parallel thread works on a single partition from both tables, reading rows, performing the join, and computing a per-partition subtotal, before moving on to a new partition. From a thread’s point of view… If you have trouble visualizing what is happening from just looking at the parallel collocated merge join execution plan, let’s look at it again, but from the point of view of just one thread operating between the two Parallelism (exchange) operators. Our thread picks up a single partition id from the Distribute Streams exchange, and starts a merge join using ordered rows from partition 1 of table T1 and partition 1 of table T2. By definition, this is all happening on a single thread. As rows join, they are added to a (per-partition) count in the Stream Aggregate immediately above the Merge Join. Eventually, either T1 (partition 1) or T2 (partition 1) runs out of rows and the merge join stops. The per-partition count from the aggregate passes on through the Nested Loops join to another Stream Aggregate, which is maintaining a per-thread subtotal. Our same thread now picks up a new partition id from the exchange (say it gets id 9 this time). The count in the per-partition aggregate is reset to zero, and the processing of partition 9 of both tables proceeds just as it did for partition 1, and on the same thread. Each thread picks up a single partition id and processes all the data for that partition, completely independently from other threads working on other partitions. One thread might eventually process partitions (1, 9, 17, 25, 33, 41) while another is concurrently processing partitions (2, 10, 18, 26, 34) and so on for the other six threads at DOP 8. The point is that all 8 threads can execute independently and concurrently, continuing to process new partitions until the wider job (of which the thread has no knowledge!) is done. This divide-and-conquer technique can be much more efficient than simply splitting the entire workload across eight threads all at once. Related Reading Understanding and Using Parallelism in SQL Server Parallel Execution Plans Suck © 2013 Paul White – All Rights Reserved Twitter: @SQL_Kiwi

    Read the article

< Previous Page | 33 34 35 36 37 38 39 40 41 42 43 44  | Next Page >