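The company's annual party is about to start, and I'm hoping to win a big prize.
Software engineers, backend engineers in particular, need to watch more than a server's CPU. They also need to keep a regular eye on its memory.
In a previous article we looked at CPU parameters and sorted out some concepts around CPU cores. In this article, let's talk about memory.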
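My cloud server
I bought a cloud server from my former employer for three or four hundred yuan a year. Its basic specs are as follows:
As you can see, it has 1 GB of memory, i.e. 1024 MB. When I check it with the free command, the result is as follows: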
ubuntu@VM-0-15-ubuntu:~$ free -wh
              total        used        free      shared     buffers       cache   available
Mem:           864M        403M         79M         20M         60M        321M        253M
Swap:            0B          0B          0B
ubuntu@VM-0-15-ubuntu:~$
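I clearly bought 1 GB of memory, so why is total only 864M? Because part of the physical memory is taken before applications ever see it: some is reserved while the server initializes its devices at boot, and the Linux kernel itself occupies some as it starts up. The 864M total is what remains for the system to hand out to applications.
To see the raw physical memory, use the dmidecode command. It reports a maximum capacity of 1 GB (the sum over all memory slots). My cloud server has just one memory module, and that module is exactly 1024 MB, so the total physical memory of this server is 1024 MB: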
ubuntu@VM-0-15-ubuntu:~$ sudo dmidecode -t memory
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 2.4 present.
Handle 0x1000, DMI type 16, 15 bytes
Physical Memory Array
Location: Other
Use: System Memory
Error Correction Type: Multi-bit ECC
Maximum Capacity: 1 GB
Error Information Handle: Not Provided
Number Of Devices: 1
Handle 0x1100, DMI type 17, 21 bytes
Memory Device
Array Handle: 0x1000
Error Information Handle: 0x0F01
Total Width: 64 bits
Data Width: 64 bits
Size: 1024 MB
Form Factor: DIMM
Set: None
Locator: DIMM 0
Bank Locator: Not Specified
Type: RAM
Type Detail: None
ubuntu@VM-0-15-ubuntu:~$
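The free command on newer Linux
I noticed that what free displays, and what those values mean, has changed compared with a few years ago. Here is the description of free from a newer Linux man page: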
free displays the total amount of free and used physical and swap memory in the system, as well as the buffers and caches used by the kernel. The information is gathered by parsing /proc/meminfo. The displayed columns are:
total       Total installed memory (MemTotal and SwapTotal in /proc/meminfo)
used        Used memory (calculated as total - free - buffers - cache)
free        Unused memory (MemFree and SwapFree in /proc/meminfo)
shared      Memory used (mostly) by tmpfs (Shmem in /proc/meminfo, available on kernels 2.6.32, displayed as zero if not available)
buffers     Memory used by kernel buffers (Buffers in /proc/meminfo)
cache       Memory used by the page cache and slabs (Cached and Slab in /proc/meminfo)
buff/cache  Sum of buffers and cache
available   Estimation of how much memory is available for starting new applications, without swapping. Unlike the data provided by the cache or free fields, this field takes into account page cache and also that not all reclaimable memory slabs will be reclaimed due to items being in use (MemAvailable in /proc/meminfo, available on kernels 3.14, emulated on kernels 2.6.27+, otherwise the same as free)
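As you can see, free gets its information from /proc/meminfo. The description of the used column also gives us the following relation:
used = total - free - buffers - cache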
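To make the relation concrete, here is a minimal sketch in C that computes used from text in /proc/meminfo format, following the man page excerpt quoted above (cache taken as Cached plus Slab). The function names meminfo_kb and used_kb are my own, made up for illustration, and the real free command may compute its columns differently depending on its version.

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB of `key` (e.g. "MemFree") from text in
 * /proc/meminfo format, or -1 if the key is missing or malformed.
 * Hypothetical helper, not part of any library. */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        /* The key must sit at the start of a line and be followed by ':',
         * so "Cached" does not match the "SwapCached:" line. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* used = total - free - buffers - cache, with cache = Cached + Slab
 * as stated in the quoted man page. Returns -1 if a field is missing. */
long used_kb(const char *text)
{
    long total = meminfo_kb(text, "MemTotal");
    long free_ = meminfo_kb(text, "MemFree");
    long buffers = meminfo_kb(text, "Buffers");
    long cached = meminfo_kb(text, "Cached");
    long slab = meminfo_kb(text, "Slab");

    if (total < 0 || free_ < 0 || buffers < 0 || cached < 0 || slab < 0)
        return -1;
    return total - free_ - buffers - (cached + slab);
}
```

To try it on a live machine, read /proc/meminfo into a buffer (for example with fopen and fread), terminate it with '\0', and pass it to used_kb. The result is in kB, so divide by 1024 to compare it with the output of free -wm.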
Your RAM is OK!
free is only 79M. Does that mean memory is almost used up? No need to panic.
ubuntu@VM-0-15-ubuntu:~$ free -wh
              total        used        free      shared     buffers       cache   available
Mem:           864M        403M         79M         20M         60M        321M        253M
Swap:            0B          0B          0B
ubuntu@VM-0-15-ubuntu:~$
The explanation at the following address says it best:
https://www.linuxatemyram.com/
1. What's going on?
Linux is borrowing unused memory for disk caching. This makes it look like you are low on memory, but you are not! Everything is fine!
2. Why is it doing this?
Disk caching makes the system much faster and more responsive! There are no downsides, except for confusing newbies. It does not take memory away from applications in any way, ever!
3. What if I want to run more applications?
If your applications want more memory, they just take back a chunk that the disk cache borrowed. Disk cache can always be given back to applications immediately! You are not low on ram!
4. Do I need more swap?
No, disk caching only borrows the ram that applications don't currently want. It will not use swap. If applications want more memory, they just take it back from the disk cache. They will not start swapping.
5. How do I stop Linux from doing this?
You can't disable disk caching. The only reason anyone ever wants to disable disk caching is because they think it takes memory away from their applications, which it doesn't! Disk cache makes applications load faster and run smoother, but it NEVER EVER takes memory away from them! Therefore, there's absolutely no reason to disable it!
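buffers/cache from different perspectives
So how do applications and Linux each regard buffers/cache? See the figure below:
On older Linux, free counts buffers/cache as part of used. That is the view from Linux's side (the screenshot below is from a different server):
On newer Linux, free does not count buffers/cache as part of used. That is the view from the application's side, as follows: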
ubuntu@VM-0-15-ubuntu:~$ free -wh
              total        used        free      shared     buffers       cache   available
Mem:           864M        403M         79M         20M         60M        321M        253M
Swap:            0B          0B          0B
ubuntu@VM-0-15-ubuntu:~$
So, in the output of free on newer Linux, the following relation holds:
total = used + free + buffers + cache
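From the application's point of view, the usable memory would seem to be free plus buffers/cache. In practice, however, only part of buffers/cache can be reclaimed for applications, so in general: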
available < free + buffers + cache
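Both relations can be checked against the numbers that free printed earlier. The snippet below is only an illustrative sketch: the struct free_row type and the two function names are hypothetical, and the tolerance parameter exists because free -h rounds every column separately.

```c
#include <stdlib.h>

/* One "Mem:" row of `free -w` output, all fields in the same unit (MB here).
 * Hypothetical type, used only for this illustration. */
struct free_row {
    long total, used, free_, buffers, cache, available;
};

/* Check total = used + free + buffers + cache, allowing `tol` units of
 * rounding error. Returns 1 if the relation holds, 0 otherwise. */
int sums_to_total(const struct free_row *r, long tol)
{
    long sum = r->used + r->free_ + r->buffers + r->cache;
    return labs(r->total - sum) <= tol;
}

/* How far available falls short of the upper bound free + buffers + cache. */
long unavailable_gap(const struct free_row *r)
{
    return r->free_ + r->buffers + r->cache - r->available;
}
```

With the row from my server (864, 403, 79, 60, 321, 253), the four columns sum to 863, which matches total within 1 MB of rounding. The upper bound free + buffers + cache is 460 MB, while available is 253 MB, so about 207 MB of it is not counted as available to new applications.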
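By the way, everything that follows refers to the free command on newer Linux. Also, this article does not discuss shared memory.
Verifying memory consumption with a program
Let's use a program to verify how memory gets consumed (leaving swap aside for now):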
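#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_BYTES (10 * 1024 * 1024)  /* allocate 10 MB per iteration */

int main(void)
{
    int mb = 0;
    char *buffer;

    /* Keep allocating and never free, until malloc fails or the OOM killer steps in. */
    while ((buffer = malloc(CHUNK_BYTES)) != NULL)
    {
        /* Write to every byte so the kernel must back the pages with real memory. */
        memset(buffer, 0, CHUNK_BYTES);
        mb += 10;
        printf("Allocated %d MB\n", mb);
        fflush(stdout);  /* keep progress visible even if output is redirected */
        sleep(1);
    }
    return 0;
}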
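Run the program and you can see free being steadily consumed, with available changing along with it:
Once free has been used up, buffers and cache step in and take over the job of being consumed. Note, however, that not all of buffers and cache can be reclaimed. So when the process finally hits OOM, cache still holds a few tens of MB while available is almost gone, as shown below:
We can see that the process was killed:
The OOM information in dmesg is as follows:
Verifying memory consumption with swap enabled
You didn't think Linux was really that weak, did you? Giving up after a mere 290 MB of allocations? Hardly.
Look at the Swap line in the output of free. Everything is 0: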
ubuntu@VM-0-15-ubuntu:~$ free -wh
              total        used        free      shared     buffers       cache   available
Mem:           864M        449M        234M         20M        8.1M        171M        238M
Swap:            0B          0B          0B
In other words, swap (disk-backed virtual memory) is not enabled on this system. Fine, let's enable 2 GB of swap:
ubuntu@VM-0-15-ubuntu:~$ free -wm
              total        used        free      shared     buffers       cache   available
Mem:            864         439         240          20          10         174         246
Swap:          2047           0        2047
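Now let's run the program and watch memory being consumed. At first, it is available memory that gets used:
Only after available memory has dropped to a certain level does the system start using swap:
Once swap is used up, there is nothing left but to eat into available memory again. When available is exhausted, the OOM killer kills the process, and the memory it held is released:
You can see that the process was killed only after it had allocated more than 2 GB:
The OOM information in dmesg is as follows:
From these experiments, we can conclude:
1. When free is close to 0, don't worry.
2. When available is close to 0, there is a problem.
3. When an OOM kill happens, there is a problem.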
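These rules of thumb can be turned into a small check. The sketch below is hypothetical: judge_memory and the min_percent threshold are my own inventions for illustration, not values defined by the kernel or by free. The one point it does encode from the experiments is that the verdict depends on available and deliberately ignores free.

```c
enum mem_state { MEM_OK, MEM_LOW };

/* Judge memory pressure from `available` rather than `free`.
 * total and available must be in the same unit (e.g. MB or kB).
 * min_percent is an assumed alert threshold chosen by the operator. */
enum mem_state judge_memory(long total, long available, int min_percent)
{
    if (total <= 0)
        return MEM_LOW; /* no usable reading: treat it as a problem */

    /* Compare available/total with min_percent/100 without floating point. */
    if (available * 100 < total * (long)min_percent)
        return MEM_LOW;
    return MEM_OK;
}
```

For my server's numbers (total 864, available 253) and an assumed threshold of 10 percent, this reports MEM_OK even though free is only 79. It would report MEM_LOW once available falls below roughly 86 MB.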
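On production servers, some have swap enabled and some do not. I just looked at one production server that has no swap. It has 144 memory slots of 32G each, so it supports up to 4608G of memory. Only 4 modules are actually installed, so the total physical memory is 128G. Apart from what the system takes, 125G is left for applications, and available is 109G. That is a lot of memory: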
taoge:~$ sudo dmidecode -t memory | grep GB
Maximum Capacity: 4608 GB
Size: 32 GB
Size: 32 GB
Size: 32 GB
Size: 32 GB
taoge:~$ free -wh
              total        used        free      shared     buffers       cache   available
Mem:           125G         20G        7.0G        1.3G        369M         97G        109G
Swap:            0B          0B          0B
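The operating system's virtual memory mechanism (swap included) is very powerful. We will come back to it in later articles.
Finally, let's look at top, a command used all the time on Linux. In a previous article we used top to watch the CPU and processes in real time (ps can also show processes). In fact, top can watch memory in real time as well. Since top is fairly simple, we won't go into detail here and will just enjoy a screenshot of its output:
OK, that's all for this article. More next time. The annual party prize draw is about to start, and I want to win.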