mysql tables are lost for DDL of "alter table .. add column ..."

mysql tables are lost for DDL of "alter table .. add column ..."

am 11.08.2011 13:22:36 von hiu

--0016e64e4c588cea9a04aa3903b4
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: quoted-printable

mysql tables are lost for DDL of "alter table .. add column ..."


*1. mysqld's error.log*

110803 3:39:16 InnoDB: Warning: problems renaming
'feel_22/#sql-2635_23d3a8' to 'feel_22/feed_send_1451', 25000 iterations
£¨first 25000£¬fil0fil.c:: fil_rename_tablespace£©
InnoDB: Warning: tablespace './feel_22/#sql-2635_23d3a8.ibd' has i/o ops
stopped for a long time 24999 £¨fil0fil.c::
fil_mutex_enter_and_prepare_for_io£©
110803 3:39:16 InnoDB: Warning: problems renaming
'feel_22/#sql-2635_23d3a8' to 'feel_22/feed_send_1451', 25001 iterations
£¨over 25000, return FALSE£©
110803 3:39:16 [ERROR] Cannot find or open table feel_22/feed_send_1451
from the internal data dictionary of InnoDB though the .frm file for the
table exists. Maybe you have deleted and recreated InnoDB data files but
have forgotten to delete the corresponding .frm files of InnoDB tables, or
you have moved .frm files to another database?
or, the table contains indexes that this version of the engine doesn't
support.

there is another interesting error info:
Error 1005: Can't create table 'feel_01.#sql-57f0_25a510' (errno: -1)


*2. rename tablespace can be successful only if:*
if (node->n_pending > 0 || node->n_pending_flushes > 0)
if (node->modification_counter > node->flush_counter)

*3. we failed to repeat this bug and failed to locate the real reason. *
Here is the information we got:

* it's single table space with innodb plugin 1.0.9 and mysql-5.1.48. we
failed to repeat this issue.
* DDL is done at mid-night, and workload is very very slow(both master and
slave suffered with issue but with very very low probability).
* for 100G ibd file, fsync is so quickly that retry number is 0 for rename
condition waiting, so we don't doult the n_pending_flushes.
* all the sql are blocked when 20000 retry first hit, but unfornantely no
other stack/core info saved.
* we suspect the io handlers are out-of-order,io_handler_threads and
srv_master_thread maybe all os_event_wait_low.

We can not prove the deadlock situation, but if the deadlock is exist in
such situation, the patch maybe help us to suffering occasional table
losting for DDL


--- /tmp/mysql-5.1.48/storage/innodb_plugin/fil/fil0fil.c 2010-06-03
23:50:08.000000000 +0800
+++ storage/innodb_plugin/fil/fil0fil.c 2011-08-11 00:23:31.000000000
+++ +0800
@@ -938,8 +938,24 @@

mutex_exit(&fil_system->mutex);

+
+#ifndef UNIV_HOTBACKUP
+ /* Wake the i/o-handler threads to make sure pending i/o's
are
+ performed */
+ os_aio_simulated_wake_handler_threads();
+
os_thread_sleep(20000);

+ /* Flush tablespaces so that we can close modified files in
the LRU
+ list */
+
+ fil_flush_file_spaces(FIL_TABLESPACE);
+#else
+
+ os_thread_sleep(20000);
+#endif
+
+
count2++;

goto retry;
@@ -2457,6 +2473,11 @@
fputs(" to ", stderr);
ut_print_filename(stderr, new_name);
fprintf(stderr, ", %lu iterations\n", (ulong) count);
+ if (node)
+ fprintf(stderr, "node info: n_pending=3D%lu,
n_pending_flushes=3D%lu"
+ " modification_counter=3D%lu,
flush_counter=3D%lu\n",
+ node->n_pending, node->n_pending_flushes,
node->modification_counter,
+ node->flush_counter);
}

mutex_enter(&fil_system->mutex);


*4. we need yours help to solve the table lost issue*

--0016e64e4c588cea9a04aa3903b4--