Opened 13 years ago

Closed 12 years ago

Last modified 12 years ago

#9773 closed Bug Report - Hang/Deadlock (Fixed)

Deadlocked BE

Reported by: warpme@… Owned by:
Priority: minor Milestone: 0.25
Component: MythTV - General Version: Master Head
Severity: medium Keywords:
Cc: Ticket locked: no

Description

Recently BE deadlocked in state where no any FE can connect. BE status page timeouts. Also query for status via xml timeouts. BE is v0.25pre-1958-g0434cd9-dirty. BE log + trace attached.

Attachments (8)

be.log (45.4 KB) - added by warpme@… 13 years ago.
BE log
trace.txt (21.9 KB) - added by warpme@… 13 years ago.
gdb trace
trace1.txt (26.7 KB) - added by warped <warpme@…> 13 years ago.
Traces after sys upgrade
be1.log (30.3 KB) - added by warped <warpme@…> 13 years ago.
Log after system upgrade
fe1.log (6.1 KB) - added by warped <warpme@…> 13 years ago.
FE Log after system upgrade
BEtrace2_and_MySqlProcList.txt (26.5 KB) - added by warpme@… 13 years ago.
BE deadlock trace + mysql command output
BE2.log (11.1 KB) - added by warpme@… 13 years ago.
BE log
trace6.txt (30.9 KB) - added by warpme@… 13 years ago.
Deadlock with protocol check failure

Download all attachments as: .zip

Change History (24)

Changed 13 years ago by warpme@…

Attachment: be.log added

BE log

Changed 13 years ago by warpme@…

Attachment: trace.txt added

gdb trace

comment:1 Changed 13 years ago by warped <warpme@…>

Previous trace shows corrupted stack. In meantime I decided to do full ArchLinux? system upgrade (2.6.38.6, glibc2.13.5, gt4.7.3, mysql5.5.12, v0.25pre-1971-g2981c13-dirty). Now be deadlocks without corrupted stack - so treaces should be more useful. I'm attaching traces + logs.

Changed 13 years ago by warped <warpme@…>

Attachment: trace1.txt added

Traces after sys upgrade

Changed 13 years ago by warped <warpme@…>

Attachment: be1.log added

Log after system upgrade

Changed 13 years ago by warped <warpme@…>

Attachment: fe1.log added

FE Log after system upgrade

comment:2 Changed 13 years ago by danielk

It looks like in Thread #3 we are perhaps stuck waiting for a reply from the DB.

Changed 13 years ago by warpme@…

BE deadlock trace + mysql command output

Changed 13 years ago by warpme@…

Attachment: BE2.log added

BE log

comment:3 Changed 13 years ago by warpme@…

Hi,

Here is deadlock trace + mysql output from SHOW FULL PROCESSLIST\G per @kormoc request.

comment:4 Changed 13 years ago by danielk

warpme,

Can you please make a backup of your database, delete it, recreate it and restore from the backup. Then report back here if the problem re-occurs or not.

comment:5 Changed 13 years ago by warpme@…

Daniel, According to 9792 we can close ticket with status invalid. Thx for Your help !

comment:6 Changed 13 years ago by warped <warpme@…>

OMG, This is REALLY bad luck. Today again I had deadlock with Protocol version check failure. Trace attached.

Changed 13 years ago by warpme@…

Attachment: trace6.txt added

Deadlock with protocol check failure

comment:7 Changed 13 years ago by stuartm

Status: newinfoneeded_new

What version of mysql is being used and which distribution? I'd like to see if there are any common factors with other database related deadlocks.

comment:8 Changed 13 years ago by warpme@…

Hi,

I'm on 5.1.57. Distro is ArchLinux?.
I do fresh build of mysql and all related packages.
In my logs attached to 9773 & 9792 I have issue with DB as deadlock is in mysql libraries (Daniel's comment #5 in 9792).
Recently:

-I change DB scheme to InnoDB (2 weeks ago)

-switch to ActiveEIT (week ago)

Since switch to InnoDB I don't have yet any deadlock in mysql libraries, but last week I had deadlock in scheduler. I have traces so if You are interested I can provide trace from this last scheduler deadlock.

br

comment:9 Changed 13 years ago by sphery

Status: infoneeded_newnew

seems that there are still problems, even after warpme's changes mentioned in the comment above: http://www.gossamer-threads.com/lists/mythtv/users/486714#486714

comment:10 Changed 13 years ago by Github

Enable automatic reconnection for MySQL connections.

After 155138a127 changed the minimum supported Qt version to 4.6, we now have support for MYSQL_OPT_RECONNECT in QSqlDatabase::setConnectOptions().

Note that before MySQL 5, reconnecting connections were the default in the MySQL C API (so, were always enabled for the Qt-MySQL driver). Unfortunately, depending on how the application is written, reconnections may cause issues (since reconnections have effects on transactions, table locks, temporary tables, session and user variables, prepared statements, and such), so MySQL 5.0.3 changed the default to disable automatic reconnections, and MySQL 5.0.13 added MYSQL_OPT_RECONNECT to make reconnection an application-controllable option. Since the Qt-MySQL driver didn't have support for specifying MYSQL_OPT_RECONNECT, it was impossible to get reconnecting connections with MySQL 5.0.3+ in Qt applications. Finally in Qt4.6, Qt devs added support for specifying MYSQL_OPT_RECONNECT, so we can re-enable automatic reconnect.

Thanks to David Edwards for the reminder to set this option.

Refs #9704, #9773, #9792.

Branch: master Changeset: dd2d7a14d10c55f8bc8d12e944adb3b3c9ac8cea

comment:11 Changed 13 years ago by Github

Fix SQL reconnection logic. Refs #9704. Refs #9773. Refs #9792.

What this does is get rid of the old KickDatabase?() logic and replaces it with a simple reconnect if the DB socket is disconnected, and also gets rid of a recent re-enablement of the automatic reconnects in [dd2d7a14d1] and importantly it extends the reconnection logic in [528324d8ba] so that it resends the prepare() and rebinds the bindings when a reconnect happens in an exec().

This also makes the QSqlQuery parent of MSqlQuery private and adds some wrappers so we can prevent MSqlQuery users from accessing methods that may invalidate some assumption in MSqlQuery + it adds debugging wrappers for first(), last(), previous(), and seek() like we already had for next().

A little bit of history is in order.

In Qt3 the SQL driver automatically reconnected whenever a connection was lost. But the first query after this auto-reconnect always failed. So in 2003 mdz added KickDatabase?() which would recover disconnect and reconnect an automatically all was good with the world.

In Qt4 the QSqlQuery::prepare() started to actually do prepared statements instead of emulating them. This caused a small problem in now it was possible that a disconnect happened between the call to prepare() and exec() and then the exec() would fail. But odds of this were still pretty low for most uses of the database in MythTV. But at some point trolltech also got rid of the automatic reconnections which caused us trouble in Qt3. To fix this we added reconnection logic in [528324d8ba] and this is where the real trouble began. While this reconnection logic really does exactly the same thing that the auto-reconnection logic did in Qt3, the fact that prepare() doesn't just emulate prepared statements in Qt4 means that it's now that when we reconnect to the database in exec() the prepared statement has not been sent to the database.

Note: This should be back-ported to 0.24-fixes after a reasonable burn-in time in master.

Branch: master Changeset: 4dfcdb8dd0c80d096ed95d5eccd489c692efa90d

comment:12 Changed 13 years ago by Github

Few bits missing in [4dfcdb8dd]. Refs #9704. Refs #9773. Refs #9792.

Branch: master Changeset: 1fb0a77fc17089b5cd2bf36ef07998bf68461261

comment:13 Changed 13 years ago by danielk

Note: until all the db reconnection issues are resolved and this ticket is possibly fixed, I will only be referencing the #9704 on reconnection related commits.

comment:14 Changed 12 years ago by warpme@…

Hi, After Daniel's MySql? reconnect commits I didn't observe exactly this issue within 4 weeks on 24/7 working production system. From my perspective we can close this ticket with status resolved.

comment:15 Changed 12 years ago by stuartm

Resolution: Fixed
Status: newclosed

comment:16 Changed 12 years ago by beirdo

Milestone: unknown0.25
Note: See TracTickets for help on using tickets.