Something's eating my memory...
on 27.08.2003 12:28:56 by Paul Furness
Hi.
Can someone help me? I'm running out of memory on my production server,
and I can't figure out exactly why.
It's a pretty new machine (about 4 months old), and the spec is: dual
2.8GHz Xeon CPU, 1G memory, SCSI hard disks, and an external SCSI RAID
controller. It is based on a build of RedHat 7.3 + redhat released
patches, and I have added the 2.4.21 kernel patched to support LVM and
XFS.
The primary (only?!) job of the server is to be a file server, offering
nfs and samba shares to servers and workstations; this includes the
users' home directories.
When I first built it, it worked like a dream, but of course it wasn't
under a big load. Over time (about a month) I ramped up the load by
adding the various shares to the machine and making them available to
users.
Over the last week, there have been a number of occasions when it ground
almost to a complete halt; the rest of the time it performed just fine.
It looks like it's having trouble when it gets hammered by everyone
logging out at the end of the day (We have roaming profiles on windows
workstations, using a samba domain controller. As an aside: it works
really well; I can't imagine why anyone would ever want an actual
windows server... ;). Anyhow, this heavy loading is to be expected.
When I run top or free, it tells me that almost all of the memory is
used, but it doesn't seem to be actually used by anything; the total
memory used by the processes that top is showing is about 80M. Buffers
is showing up as anything between about 450M and 700M. Clearly, the
performance issue is happening when the memory fills up and it starts
swapping. Here's an example output from free:
             total       used       free     shared    buffers     cached
Mem:       1032104    1019200      12904          0       2100     342172
-/+ buffers/cache:     674928     357176
Swap:      2096472       3508    2092964
I don't mind putting more memory into the server if this is the
solution, but I need to be sure that it will actually help - if I put in
another G and it fills up just the same, I'm back where I started but a
little bit poorer!
My problems are:
1. I don't really understand how the buffers are allocated, or why, and
whether changing this would help performance.
2. There seems to be at least 150-200M of memory that I can't account
for.
Can anyone point me to where I can find out about the buffers, what they
are and how they work?
Can anyone suggest some accurate performance monitoring software that I
can use to find out what exactly is happening when the server grinds to
a halt? I guess I really need to know where the memory is going and
possibly the disk activity. Gkrellm is sort of useful, but I really need
something a bit more determined :)
As always, any and all suggestions much appreciated.
Paul.
-
To unsubscribe from this list: send the line "unsubscribe linux-admin" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Something's eating my memory...
on 27.08.2003 13:12:04 by Benjamin Walkenhorst
Hello Paul,
Unfortunately, I am not able to give you definite answers, but maybe some
hints...
On Wednesday, 27 August 2003 12:28 Paul Furness wrote:
> It's a pretty new machine (about 4 months old), and the spec is: dual
> 2.8GHz Xeon CPU, 1G memory, SCSI hard disks, and an external SCSI RAID
> controller. It is based on a build of RedHat 7.3 + redhat released
> patches, and I have added the 2.4.21 kernel patched to support LVM and
> XFS.
a) What motherboard, NICs and controller(s) does the machine have? I'm not
sure how much this applies to SMP boards, but a bad motherboard can sometimes
badly hurt system performance, and I guess the same goes for other components
when they hit a sudden peak in system load.
b) Correct me if I'm wrong, but doesn't 2.4.21 include native support for LVM
and XFS? Or do you need a patch to use XFS on LVM-volumes?
> The primary (only?!) job of the server is to be a file server, offering
> nfs and samba shares to servers and workstations; this includes the
> users' home directories.
A dual Xeon 2.8 seems like a lot for a file server to me. How
many clients does it have to serve?
Do you have all the latest patches installed? RH7.3 is quite old, I think,
and there have been some patches (especially to Samba, I think) which I
understand vastly improve Samba performance, especially under heavy load.
> When I first built it, it worked like a dream, but of course it wasn't
> under a big load. Over time (about a month) I ramped up the load by
> adding the various shares to the machine and making them available to
> users.
>
> Over the last week, there have been a number of occasions when it ground
> almost to a complete halt; the rest of the time it performed just fine.
> It looks like it's having trouble when it gets hammered by everyone
> logging out at the end of the day (We have roaming profiles on windows
> workstations, using a samba domain controller. As an aside: it works
> really well; I can't imagine why anyone would ever want an actual
> windows server... ;). Anyhow, this heavy loading is to be expected.
I think your memory issue is not directly causing these performance problems.
I do not know exactly how Samba and NFS work, but maybe the client sends all
its remaining write data to the server, or maybe the server flushes its file
buffers at logout?
>
> When I run top or free, it tells me that almost all of the memory is
> used, but it doesn't seem to be actually used by anything; the total
> memory used by the processes that top is showing is about 80M. Buffers
> is showing up as anything between about 450M and 700M. Clearly, the
> performance issue is happening when the memory fills up and it starts
> swapping. Here's an example output from free:
>
>              total       used       free     shared    buffers     cached
> Mem:       1032104    1019200      12904          0       2100     342172
> -/+ buffers/cache:     674928     357176
> Swap:      2096472       3508    2092964
>
> I don't mind putting more memory into the server if this is the
> solution, but I need to be sure that it will actually help - if I put in
> another G and it fills up just the same, I'm back where I started but a
> little bit poorer!
You should take a look at the output of top or ps to see which processes use
how much memory.
Having lots of memory used for cache and buffers seems fine to me, because it
reduces load on the hard drives.
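To make the arithmetic behind the "-/+ buffers/cache" line concrete, here is a minimal Python sketch. It uses only the numbers from the free output quoted above; the function name adjust_for_cache is made up for illustration and is not part of any tool.

```python
# Figures (in kB) copied from the "Mem:" line of the free output quoted above.
total, used, free, buffers, cached = 1032104, 1019200, 12904, 2100, 342172


def adjust_for_cache(used, free, buffers, cached):
    """Return (used, free) with buffers and page cache counted as free."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


adjusted_used, adjusted_free = adjust_for_cache(used, free, buffers, cached)
print("used, excluding buffers/cache:", adjusted_used, "kB")   # 674928
print("free, including buffers/cache:", adjusted_free, "kB")   # 357176
```

The two results match the second line of the free output. Note that the "used" figure there covers more than the processes top lists: kernel allocations are counted in it as well, which may explain part of the gap to the roughly 80M seen in top.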
I think that the system may be rather lazy about flushing its buffers, piling
up lots of changes to the file system. Then lots of clients disconnect
simultaneously, forcing or at least prompting the system to flush its
write buffers, which may in fact slow down the system quite a bit.
Or maybe the users all save their work just before logging out. That would
mean you get a write request every now and then all day, but a large number
of requests all at once at the end of the day.
I suggest you give the Samba and NFS servers an upgrade and maybe install
flushd (as an alternative, you can place an entry in crontab that runs "sync"
at regular intervals).
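The "run sync at regular intervals" idea can be sketched in Python as follows. This is only an illustration of the approach, not a tested fix for this server: the function name flush_periodically and its parameters are hypothetical, and it assumes a Unix system with Python 3.3 or later, where os.sync() is available.

```python
import os
import time


def flush_periodically(interval_s, rounds, sync=None, sleep=time.sleep):
    """Call sync() `rounds` times, sleeping interval_s seconds between calls.

    `sync` and `sleep` can be replaced for testing; by default the real
    os.sync() (flush dirty buffers to disk) and time.sleep() are used.
    """
    if sync is None:
        sync = os.sync
    for i in range(rounds):
        sync()
        if i < rounds - 1:
            sleep(interval_s)
    return rounds


# Example (not executed here): flush every 30 seconds, 10 times in a row.
# flush_periodically(interval_s=30, rounds=10)
```

A cron entry that runs "sync" achieves the same thing without a long-running process; the sketch just makes the interval explicit.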
From what I read, I understand that Linux 2.6 will also come with much
improved memory management, but who knows when?
You could also run process accounting, but I don't know much about that.
As an alternative you could try FreeBSD; it's a) very good in overall
performance, as well as under heavy load, and b) very good at memory
management. It also has a software RAID driver, NFS and Samba are available
too (surprise...), and it also supports SMP. So I think it might do the job
just as well.
(On the other hand it completely lacks GUI-based administration tools, but
system configuration is much more tidy than with Linux.)
> As always, any and all suggestions much appreciated.
>
> Paul.
--
Benjamin Walkenhorst
eMail: krylon@gmx.net
homepage: http://www.krylon.de
Re: Something's eating my memory...
on 27.08.2003 13:15:13 by Bruce Ferrell
When I run into this type of problem, I turn to sysstat. I use it in
conjunction with bigbrother/larrd/rrdtool and a bigbrother plugin called
bb-sar from www.deadcat.net.
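If sysstat is not installed yet, a rough stand-in for this kind of sampling can be sketched in Python. This is not sysstat and not what bb-sar does; it is a minimal illustration that assumes /proc/meminfo contains "Key: value kB" lines (older kernels may add extra header lines, which the parser skips). All function names here are made up for the example.

```python
import time

# Fields to log on every sample; values are in kB.
FIELDS = ("MemFree", "Buffers", "Cached", "SwapFree")


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of ints; skip anything else."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def format_sample(info, now=None):
    """Build one log line: a timestamp followed by the selected fields."""
    stamp = time.strftime("%H:%M:%S", time.localtime(now))
    cols = " ".join(f"{name}={info.get(name, 0)}" for name in FIELDS)
    return f"{stamp} {cols}"


def sample_once(path="/proc/meminfo"):
    """Read the file once and return a formatted log line."""
    with open(path) as fh:
        return format_sample(parse_meminfo(fh.read()))


# Example loop (not executed here): one line every 10 seconds.
# while True:
#     print(sample_once())
#     time.sleep(10)
```

Leaving such a loop running into a log file across the end-of-day logout rush would show whether free memory and swap really change when the server slows down.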
I almost missed the NFS aspect of this. Is the system load suddenly
skyrocketing? I would start by checking whether something exported over NFS
has been "taken" away. I saw something similar to this on a mail server with
autofs-mounted home directories. We had an admin who liked to rearrange
home directories. It caused the system to get very confused and be very
unhappy, as NFS, even with soft mounts and the interruptible flag set, would
keep pounding on the old, missing location.
This doesn't look like a memory allocation problem at all. If kernel
logging isn't enabled, enable it. You'll get some good clues there.