Call for help on a crashing bug in MySQL 5.0
Markus Popp was successful when asking for help to make a bug repeatable. So why not try it myself?
As I wrote in my field report we hit a few crashing bugs after upgrading to 5.0.15. With 5.0.16 and after rewriting some queries things have almost stabilized now and we enjoy a great product.
However there still remains one nasty crashing bug I couldn't yet add to the bugs database as it's not fully repeatable: We run a MediaWiki (1.4.4 default install) for some documentation on our intranet. The following simple query from MediaWiki to determine a user id from a user name on login sometimes crashes the server:
/* User::idFromName */ SELECT user_id FROM `user` WHERE user_name='username' LIMIT 1
We only have a few logins per day, but the query still crashes probably around 3 times a week. Thats the output from SHOW CREATE TABLE:
CREATE TABLE `user` ( `user_id` int(5) unsigned NOT NULL auto_increment, `user_name` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `user_real_name` varchar(255) character set latin1 collate latin1_bin NOT NULL default '', `user_password` tinyblob NOT NULL, `user_newpassword` tinyblob NOT NULL, `user_email` tinytext NOT NULL, `user_options` blob NOT NULL, `user_touched` varchar(14) character set latin1 collate latin1_bin NOT NULL default '', `user_token` varchar(32) character set latin1 collate latin1_bin NOT NULL default '', PRIMARY KEY (`user_id`), KEY `user_name` (`user_name`(10)) ) ENGINE=MyISAM DEFAULT CHARSET=latin1
A table check returns no problems, the table contains only 5 rows (internal users only).
That's the crash info with a resolved stack trace from MySQL 5.0.16-standard-log-i686-glibc23 running on Linux 2.4.21 (SuSE Professional):
mysqld got signal 11; ... key_buffer_size=536870912 read_buffer_size=2093056 max_used_connections=64 max_connections=300 threads_connected=11 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 1751885 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd=0x665ac628 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... Cannot determine thread, fp=0x661f2768, backtrace may not be correct. Stack range sanity check OK, backtrace follows: 0x8150650 handle_segfault + 356 0x4003b96c _end + 934239804 0x814155d _ZNK12Field_string4typeEv + 13 0x81e4d6c _Z11get_mm_leafP13st_qsel_paramP4ItemP5FieldP11st_key_partN9Item_func8FunctypeES2_ + 916 0x81e2fde _Z12get_mm_partsP13st_qsel_paramP4ItemP5FieldN9Item_func8FunctypeES2_11Item_result + 286 0x81e14b3 _Z11get_mm_treeP13st_qsel_paramP4Item + 1339 0x81db127 _ZN10SQL_SELECT17test_quick_selectEP3THD6BitmapILj64EEyyb + 1283 0x81a107f _Z22get_quick_record_countP3THDP10SQL_SELECTP8st_tablePK6BitmapILj64EEy + 59 0x8198546 _Z20make_join_statisticsP4JOINP13st_table_listP4ItemP16st_dynamic_array + 2550 0x818f75a _ZN4JOIN8optimizeEv + 750 0x8192946 _Z12mysql_selectP3THDPPP4ItemP13st_table_listjR4ListIS1_ES2_jP8st_orderSB_S2_SB_mP13select_resultP18st_select_lex_unitP13st_sel + 130 0x818ebd2 _Z13handle_selectP3THDP6st_lexP13select_resultm + 234 0x8164175 _Z21mysql_execute_commandP3THD + 601 0x816ae7a _Z11mysql_parseP3THDPcj + 306 0x8162aa2 _Z16dispatch_command19enum_server_commandP3THDPcj + 1178 0x81625cd _Z10do_commandP3THD + 129 0x8161aad handle_one_connection + 569 0x40035f60 _end + 934216752 0x401cb327 _end + 935876599 New value of fp=(nil) failed sanity check, terminating stack trace! ... Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x8cf51f0 = /* User::idFromName */ SELECT user_id FROM `user` WHERE user_name='Bastienne' LIMIT 1 thd->thread_id=178439 The manual page at http://www.mysql.com/doc/en/Crashing.html contains information that should help you find out what is causing the crash. Number of processes running now: 0
Maybe somebody familiar with MySQL's internals can determine from the stack trace what's going on here and at least point me into the right direction. Or somebody else also runs a MediaWiki and experiences something similar. I'd be very happy to finally make a repeatable test case out of this! Thanks in advance!