init problem
Chris Mason
2002-07-11 19:54:09 UTC
Hi,

Out of the blue, init has decided to be nasty. I logged on to my
Sparc 5 and ran a process list, and was presented with the following:

mail 9078 1 0 May31 ? 00:03:21 [exim <defunct>]
root 2556 1 0 Jun23 ? 00:02:57 [nmbd <defunct>]
root 2558 1 0 Jun23 ? 00:00:00 [nmbd <defunct>]
root 2561 1 0 Jun23 ? 00:00:08 [smbd <defunct>]
root 25796 1 0 Jun27 ? 00:00:09 [smbd <defunct>]
root 26328 1 0 Jun27 ? 00:00:02 [sshd <defunct>]
root 31042 1 0 Jun28 ? 00:00:00 [init <defunct>]
root 31749 1 0 Jun28 ? 00:08:07 [nmbd <defunct>]
root 31754 1 0 Jun28 ? 00:00:17 [smbd <defunct>]
chris 18691 1 0 Jul01 ? 00:00:17 [cc1 <defunct>]
sshd 6549 1 0 Jul04 ? 00:00:07 [sshd <defunct>]
root 6766 1 0 Jul04 ? 00:00:12 [apache <defunct>]
root 19226 1 0 Jul06 ? 00:00:01 [sshd <defunct>]
chris 19228 1 0 Jul06 ? 00:00:01 [sshd <defunct>]
mail 19750 1 0 Jul06 ? 00:00:00 [exim <defunct>]
mail 19756 1 0 Jul07 ? 00:00:00 [exim <defunct>]
mail 19800 1 0 Jul07 ? 00:00:00 [exim <defunct>]
chris 20085 1 0 Jul07 ? 00:00:03 [ssh <defunct>]
root 21126 1 0 Jul07 ? 00:00:35 [smbd <defunct>]
chris 21155 1 0 Jul07 ? 00:00:03 [ssh <defunct>]
chris 21471 1 0 Jul07 ? 00:00:03 [ssh <defunct>]
mail 23073 1 0 Jul08 ? 00:00:00 [exim <defunct>]
mail 23080 1 0 Jul08 ? 00:00:00 [exim <defunct>]
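
For anyone who wants to pull only the defunct entries out of a full listing, here is a minimal sketch. It assumes a procps-style `ps` that accepts `-eo pid=,ppid=,stat=,comm=`; the function names `list_zombies` and `count_by_parent` are made up for illustration and are not part of any package.

```shell
#!/bin/sh
# Print PID, PPID and command name for every zombie (state Z) process.
# Expects lines in the format produced by: ps -eo pid=,ppid=,stat=,comm=
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Count zombies per parent PID. A growing count under PPID 1 suggests
# that init has stopped reaping the children it inherited.
count_by_parent() {
    list_zombies | awk '{ n[$2]++ } END { for (p in n) print p, n[p] }'
}

# Typical use on a live system (hypothetical invocation):
#   ps -eo pid=,ppid=,stat=,comm= | count_by_parent
```

Every entry in the listing above has PPID 1, so a count like this would show them all under init.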

I am aware that once a defunct process has been inherited by init it
cannot be killed; only init can clear it. I read somewhere that
switching to single-user mode and back again may rectify the situation.
Before attempting to switch to init 1 I decided to attempt to restart
init. On issuing `telinit q` I got the following response:

telinit: timeout opening/writing control channel /dev/initctl
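
A timeout here would be consistent with nothing reading the control pipe. As a rough sketch, the two checks below look at whether the control channel is still a named pipe and what state the kernel reports for a process. The helper names `check_fifo` and `proc_state` are hypothetical, and the sketch assumes a Linux `/proc` whose `status` file carries a `State:` line.

```shell
#!/bin/sh
# Report whether a path is a named pipe (FIFO), which is what init's
# control channel is expected to be.
check_fifo() {
    if [ -p "$1" ]; then
        echo "fifo"
    elif [ -e "$1" ]; then
        echo "not-a-fifo"
    else
        echo "missing"
    fi
}

# Print the one-letter process state (R, S, D, Z, T, ...) found in a
# /proc/<pid>/status style file passed as the first argument.
proc_state() {
    awk '/^State:/ { print $2; exit }' "$1"
}

# Typical use (hypothetical invocation, as root):
#   check_fifo /dev/initctl
#   proc_state /proc/1/status
```

If the pipe is present and PID 1 is not in state Z, the problem probably lies elsewhere; this only narrows things down and does not fix anything.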

This box is running Debian Woody with the latest packages installed and
kernel 2.2.20. I know the solution is to reboot the box, but with an
uptime of 99 days I would rather fix it some other way. I do not
believe in rebooting boxes unless it is because of a relocation.

The reason I mention this is that I suspect a bug in init, or in
something related, and I was wondering whether anyone else had seen it.

Kind Regards,
Chris Mason
--
To UNSUBSCRIBE, email to debian-sparc-***@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact ***@lists.debian.org
Chris Mason
2002-07-12 13:59:46 UTC
I decided to attempt to reboot the box and I got the following:

***@hantu:/var/log# shutdown -r now

Broadcast message from root (pts/0) (Fri Jul 12 14:57:07 2002):

The system is going down for reboot NOW!
init: timeout opening/writing control channel /dev/initctl
$

The box will not shut down.

I also noticed that the list of zombie processes includes an instance
of init.

Chris

Michael Hicks
2002-07-12 17:14:23 UTC
"Chris Mason" <***@bash.sh> wrote:
[snip]
Post by Chris Mason
The box will not shutdown.
I also noticed that on the list of Zombied processes was an instance of
init.

I saw this happen on one of my boxes a few weeks ago (uptime currently
shows 31 days, which sounds about right). My suspicion is that init was
upgraded, but the new version didn't safely replace the running one.
Other than that, I have no ideas. I believe I power-cycled the box, and
it's been fine since.
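
One way to test the "upgraded but not replaced" theory on a running box is to look at where `/proc/1/exe` points. This is only a sketch. It assumes a Linux kernel that appends ` (deleted)` to the link target when the on-disk binary has been removed or replaced, and the helper name `is_stale_exe` is made up for illustration.

```shell
#!/bin/sh
# Succeed if a /proc/<pid>/exe link target carries the " (deleted)"
# suffix, meaning the running process is still using a binary that
# no longer exists under that name on disk.
is_stale_exe() {
    case "$1" in
        *" (deleted)") return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use (hypothetical invocation, as root):
#   target=$(readlink /proc/1/exe)
#   if is_stale_exe "$target"; then
#       echo "running init predates the installed binary"
#   fi
```

A stale link would only show that init was not re-executed after an upgrade; it would not by itself prove that the upgrade caused the hang.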

I've been running one of the 2.4 kernels from Woody (2.4.17 or 2.4.18).
The box's uptime was pretty high at the time; I think it had been going
for a few months.
--
Mike Hicks [mailto:***@csom.umn.edu]
Unix Support Assistant | Carlson School of Management
Office: 1-160 Phone: 6-7909 | University of Minnesota