Opened 12 years ago

Closed 12 years ago

#10302 closed Bug Report - General (Duplicate)

Backend socket problems on Centos 6

Reported by: Jonathan Martens <jonathan@…>
Owned by:
Priority: minor
Milestone: unknown
Component: MythTV - General
Version: Master Head
Severity: medium
Keywords:
Cc:
Ticket locked: no

Description

I have seen there are a few bugs concerning the communication between the backend and frontend on CentOS 6.x.

I am also running into this issue with the latest trunk. It shows up in mythfrontend's logs like this:

2012-02-01 22:37:28.763081 I  MythCoreContext: Connecting to backend server: 10.0.10.228:6543 (try 1 of 1)
2012-02-01 22:37:28.763523 I  MythSocket(27b03a0:44): IP is local, using loopback address instead
2012-02-01 22:37:28.763538 I  MythSocket(27b03a0:44): attempting connect() to (127.0.0.1:6543)
2012-02-01 22:37:28.763573 I  MSocketDevice::connect: setting Protocol to IPv4
2012-02-01 22:37:28.763579 I  MSocketDevice::connect: attempting to create new socket
2012-02-01 22:37:28.763928 I  MythSocket(27b03a0:44): write -> 44 30      MYTH_PROTO_VERSION 72 D78EFD6F
2012-02-01 22:37:35.765089 E  MythSocket(27b03a0:44): readStringList: Error, timed out after 7000 ms.
2012-02-01 22:37:35.765181 C  Protocol version check failure.
                        The response to MYTH_PROTO_VERSION was empty.
                        This happens when the backend is too busy to respond,
                        or has deadlocked in due to bugs or hardware failure.
2012-02-01 22:37:35.765231 C  Unable to determine master backend time zone settings.  If those settings differ from local settings, some functionality will fail.

While investigating this, I looked through my mythbackend logs and found no evidence of the command ever arriving at the backend. tcpdump confirms that the backend sends no response at all.
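One way to repeat the handshake outside mythfrontend is a small standalone probe. The sketch below is not MythTV code. The framing (an 8-character, space-padded decimal length followed by the payload) is inferred from the `write ->` log line above, assuming the leading 44 there is the socket descriptor; the version 72 and token D78EFD6F are copied from that same line; the names `frame` and `probe` are made up for illustration.

```python
import socket


def frame(command):
    """Prefix the command with its byte length, left-justified in 8 characters.

    This layout is an assumption read off the frontend log, not a documented API.
    """
    payload = command.encode("utf-8")
    header = str(len(payload)).ljust(8).encode("ascii")
    return header + payload


def probe(host="127.0.0.1", port=6543, timeout=7.0,
          command="MYTH_PROTO_VERSION 72 D78EFD6F"):
    """Send one framed command and return the raw reply bytes.

    Returns None if nothing arrives within `timeout` seconds.
    An empty bytes object means the peer closed the connection.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(frame(command))
        try:
            return sock.recv(4096)
        except socket.timeout:
            return None
```

Calling `probe()` with no arguments targets 127.0.0.1:6543 with the same 7 second timeout the frontend log shows. A `None` result would correspond to the readStringList timeout above.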

If I look at the output of netstat I see the following after the error:

[jonathan@localhost ~]$ sudo netstat -tonp | grep 6543
tcp       39      0 127.0.0.1:6543              127.0.0.1:51125             CLOSE_WAIT  -                   off (0.00/0/0)
tcp       39      0 127.0.0.1:6543              127.0.0.1:51126             CLOSE_WAIT  -                   off (0.00/0/0)
[jonathan@localhost ~]$

A little googling led me to this: http://www.sunmanagers.org/pipermail/summaries/2006-January/007068.html

This seems to indicate that: "CLOSE_WAIT connections indicate an error in the software. It's a connection which has been torn down but your side of things still has a filedescriptor open."
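The state described in that quote can be reproduced on loopback. This is a minimal sketch, assuming Linux: the TCP_INFO lookup and the state number 8 for CLOSE_WAIT are Linux-specific, and the helper names `tcp_state` and `demo_close_wait` are hypothetical. The peer connects, sends some bytes and closes, while the listening side still holds its descriptor open.

```python
import socket
import sys

TCP_CLOSE_WAIT = 8  # numeric TCP state on Linux; other systems use other values


def tcp_state(sock):
    """Return the kernel's TCP state number for sock, or None when not on Linux."""
    if not sys.platform.startswith("linux") or not hasattr(socket, "TCP_INFO"):
        return None
    # On Linux the first byte of struct tcp_info is the connection state.
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 1)
    return raw[0]


def demo_close_wait(payload):
    """Let a client connect, send and close; report what the server side sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free loopback port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    client.sendall(payload)
    client.close()  # the client gives up, as the frontend does after its timeout

    conn, _ = listener.accept()
    conn.settimeout(5)
    data = b""
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # empty read: the peer's close has been seen
            break
        data += chunk
    # Our descriptor is still open here, so the kernel keeps the
    # connection in CLOSE_WAIT until conn.close() is called.
    state = tcp_state(conn)
    conn.close()
    listener.close()
    return data, state
```

If the server side never reaches its own `close()`, the socket stays in CLOSE_WAIT indefinitely, which is what the netstat output above shows for port 6543.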

For the record the version numbers:

Backend:

[jonathan@localhost ~]$ mythbackend --version
Please attach all output as a file in bug reports.
MythTV Version : v0.25pre-4297-ga043706
MythTV Branch : master
Network Protocol : 72
Library API : 0.25.20120201-1
QT Version : 4.6.2
Options compiled in:
 linux profile use_hidesyms using_oss using_backend using_bindings_perl using_bindings_python using_bindings_php using_dvb using_frontend using_hdhomerun using_ceton using_hdpvr using_iptv using_ivtv using_joystick_menu using_libcrypto using_libudf using_lirc using_mheg using_opengl_video using_qtwebkit using_qtscript using_qtdbus using_v4l2 using_v4l1 using_x11 using_xrandr using_bindings_perl using_bindings_python using_bindings_php using_mythtranscode using_opengl using_ffmpeg_threads using_live using_mheg using_libudf
[jonathan@localhost ~]$

Frontend:

[jonathan@localhost ~]$ mythfrontend --version
Please attach all output as a file in bug reports.
MythTV Version : v0.25pre-4297-ga043706
MythTV Branch : master
Network Protocol : 72
Library API : 0.25.20120201-1
QT Version : 4.6.2
Options compiled in:
 linux profile use_hidesyms using_oss using_backend using_bindings_perl using_bindings_python using_bindings_php using_dvb using_frontend using_hdhomerun using_ceton using_hdpvr using_iptv using_ivtv using_joystick_menu using_libcrypto using_libudf using_lirc using_mheg using_opengl_video using_qtwebkit using_qtscript using_qtdbus using_v4l2 using_v4l1 using_x11 using_xrandr using_bindings_perl using_bindings_python using_bindings_php using_mythtranscode using_opengl using_ffmpeg_threads using_live using_mheg using_libudf
[jonathan@localhost ~]$

Is there anything I can do to debug this or help in troubleshooting this issue?

Change History (8)

comment:1 Changed 12 years ago by Jonathan Martens <jonathan@…>

While trying to investigate what is happening, I ran mythbackend under gdb, but then the problem does not seem to occur. Any other tips on how to debug this?

comment:2 Changed 12 years ago by J.Pilk@…

I had port 6543 problems with Fedora 15 which turned out to be firewall related. Mike Dean suggested that this might apply to CentOS too. Apologies if you've looked at this. http://www.gossamer-threads.com/lists/mythtv/users/501792#501792

comment:3 in reply to:  2 Changed 12 years ago by Jonathan Martens <jonathan@…>

Thanks. I had not seen that thread, but since this is a combined frontend and backend machine I doubt it is firewall related, all the more so because the backend works when run under gdb and fails when it is not.

comment:4 Changed 12 years ago by Jonathan Martens <jonathan@…>

I just checked: even with iptables disabled, the symptoms are the same.

comment:5 Changed 12 years ago by Jonathan Martens <jonathan@…>

Just to confirm: even the changes in 37385baff83eea77116d13e22be96b74cfde2cec do not fix this in my case. If I run mythbackend under gdb, however, it still works.

comment:6 Changed 12 years ago by Jonathan Martens <jonathan@…>

I had already disabled selinux as I know this might interfere:

[root@c6-gui ~]# sestatus
SELinux status: disabled
[root@c6-gui ~]#

I grepped audit.log, however, and found entries for mythfrontend but not for mythbackend:

[root@c6-gui ~]# grep mythfrontend /var/log/audit/audit.log
type=ANOM_ABEND msg=audit(1328219124.691:4139): auid=500 uid=500 gid=500 ses=545 pid=22562 comm="mythfrontend" sig=11
type=ANOM_ABEND msg=audit(1328389705.213:6271): auid=500 uid=500 gid=500 ses=881 pid=5550 comm="mythfrontend" sig=11
type=ANOM_ABEND msg=audit(1328389937.011:6278): auid=500 uid=500 gid=500 ses=881 pid=5877 comm="mythfrontend" sig=11
type=ANOM_ABEND msg=audit(1328465810.800:7441): auid=500 uid=500 gid=500 ses=1042 pid=20176 comm="mythfrontend" sig=11
type=ANOM_ABEND msg=audit(1328466037.585:7448): auid=500 uid=500 gid=500 ses=1042 pid=20217 comm="mythfrontend" sig=11
type=ANOM_ABEND msg=audit(1328466318.192:7456): auid=500 uid=500 gid=500 ses=1042 pid=25789 comm="mythfrontend" sig=11
[root@c6-gui ~]#
[root@c6-gui ~]# grep mythbackend /var/log/audit/audit.log
[root@c6-gui ~]#
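The audit entries above can be decoded into something easier to read. This is a small sketch, assuming the `type=... msg=audit(<epoch>:<serial>): key=value ...` layout seen in these lines. `parse_abend` is a hypothetical helper, and the signal-name lookup relies on Python's `signal` module as it behaves on Linux.

```python
import re
import signal
from datetime import datetime, timezone

# Matches the record layout visible in the grep output above.
LINE_RE = re.compile(r'^type=(\w+) msg=audit\((\d+\.\d+):(\d+)\): (.*)$')


def parse_abend(line):
    """Turn one audit.log line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec_type, epoch, serial, rest = m.groups()
    # The remainder is a run of space-separated key=value pairs.
    fields = dict(kv.split("=", 1) for kv in rest.split())
    return {
        "type": rec_type,
        "time": datetime.fromtimestamp(float(epoch), tz=timezone.utc),
        "serial": int(serial),
        "pid": int(fields["pid"]),
        "comm": fields["comm"].strip('"'),
        "signal": signal.Signals(int(fields["sig"])).name,
    }
```

Feeding each grep line through `parse_abend` gives the UTC time, process ID and signal name of every abnormal mythfrontend exit, which makes it easier to line them up with the frontend log timestamps.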

comment:7 Changed 12 years ago by Jonathan Martens <jonathan@…>

This might be a duplicate of #10265, as http://code.mythtv.org/trac/ticket/10265#comment:9 mentions messagebus being part of the problem. Since I could not run X windows on CentOS 6 with messagebus disabled, I created a separate frontend and backend and disabled messagebus on the backend. After that all seems to be OK: frontend and backend communicate and lockups no longer occur.

Are there any pointers on how to investigate this next? I would like to be able to run the next release on CentOS 6/EL6.

comment:8 Changed 12 years ago by sphery

Resolution: Duplicate
Status: new → closed

Marking as a dup of #10265 based on comment:7
