less 21353 root 4r REG 252,2 27 260360 /opt/test/test (deleted) <<<<<<<<<<<<<
Understanding the output of the "lsof" command:
COMMAND: Name of the command (process) that has the file open.
PID: Process ID of the process that has the file open.
USER: User who owns the process.
FD: File descriptor. The flags in this column are:
#: The number in front of the flag(s) is the file descriptor number the process uses for the file
u: File open with read and write access
r: File open with read access
w: File open with write access
W: Lock character that follows the access mode; the process holds a write lock on the entire file
mem: Memory-mapped file, usually a shared library
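TYPE: File type. Common values are:
REG - Regular file
DIR - Directory
CHR - Character special file
DEVICE: Major and minor number of the device on which the file resides.
SIZE/OFF: File size in bytes, or the file offset (shown with a leading 0t or 0x)
NODE: Inode number
NAME: File name, with "(deleted)" appended if the file has been unlinked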
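To make the FD column concrete, here is a minimal sketch that splits a value such as 4r into its descriptor number, access mode, and optional lock character. The function name split_fd is hypothetical (it is not part of lsof), and the sketch assumes the mode is the first character after the number.

```shell
#!/bin/sh
# split_fd is a hypothetical helper, not part of lsof.
# It splits an lsof FD value such as "4r" or "3uW" into
# descriptor number, access mode, and optional lock character.
split_fd() {
    fd_field=$1
    # Strip everything from the first non-digit onward: "4r" -> "4"
    num=${fd_field%%[!0-9]*}
    if [ -z "$num" ]; then
        # Values such as cwd, rtd, txt, mem are not numeric descriptors
        echo "special=$fd_field"
        return 0
    fi
    rest=${fd_field#"$num"}
    mode=$(printf '%s' "$rest" | cut -c1)
    lock=$(printf '%s' "$rest" | cut -c2-)
    echo "fd=$num mode=$mode lock=$lock"
}

split_fd 4r     # prints: fd=4 mode=r lock=
split_fd cwd    # prints: special=cwd
```

For the lsof line at the top of this section, the value 4r therefore means descriptor 4, opened read-only.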
Now we know that process 21353 still has the file open, and that the file descriptor is 4.
The kernel exposes every open descriptor under /proc/<PID>/fd/, so /proc/21353/fd/4 still refers to the inode of the deleted file, and its contents can be copied from there.
The following steps recover the deleted file:
# ls -l /proc/21353/fd/4
lr-x------ 1 root root 64 Sep 16 05:28 /proc/21353/fd/4 -> /opt/test/test (deleted)
# cp /proc/21353/fd/4 /opt/test/test.bkp
Now verify the content of the restored file.
Note: Don't use the -a flag with cp, as this will copy the (broken) symbolic link, rather than the actual file contents.
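The recovery steps above can be wrapped in a small function. This is only a sketch under stated assumptions: recover_deleted is a hypothetical name, the caller is assumed to have permission to read /proc/PID/fd (normally root or the process owner), and the refusal to overwrite an existing destination is an extra safety check that the original steps do not include.

```shell
#!/bin/sh
# recover_deleted is a hypothetical helper wrapping the cp step above.
# Usage: recover_deleted PID FD DEST
recover_deleted() {
    pid=$1 fd=$2 dest=$3
    link=/proc/$pid/fd/$fd
    # readlink shows the original path with " (deleted)" appended
    target=$(readlink "$link") || { echo "cannot read $link" >&2; return 1; }
    case $target in
        *" (deleted)") ;;
        *) echo "$link -> $target is not a deleted file" >&2; return 1 ;;
    esac
    # Refuse to overwrite an existing file at the destination
    if [ -e "$dest" ]; then
        echo "$dest already exists" >&2
        return 1
    fi
    # Plain cp follows the /proc link and copies the file contents;
    # cp -a would copy the (broken) symbolic link instead.
    cp "$link" "$dest"
}
```

With the values from the example, the call would be recover_deleted 21353 4 /opt/test/test.bkp.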
In addition, once you know which process is holding the file, you can inspect that process's environment as follows.
Checking the environment variables of the ASM pmon process: it shows that ORACLE_HOME is set to /oracle_grid/product/11.2.0.3/grid/ (with a trailing slash).
# ps -ef | grep pmon
oracle 27232 1 0 May30 ? 00:07:05 asm_pmon_+ASM1
# cat /proc/27232/environ
__CLSAGFW_TYPE_NAME=ora.asm.typeORA_CRS_HOME=/oracle_grid/product/11.2.0.3/grid/HOSTNAME=aude3od015naboi.basdev.aurdev.national.com.auTERM=xtermSHELL=/bin/bash__CR......
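The output above runs together because the entries in /proc/PID/environ are separated by NUL bytes rather than newlines. A minimal sketch for printing one variable per line follows; show_env is a hypothetical helper name, and reading another user's environ normally requires root.

```shell
#!/bin/sh
# show_env is a hypothetical helper: print one VAR=value per line
# for the given PID by translating the NUL separators to newlines.
show_env() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Example: list the first few variables of the current shell, sorted
show_env "$$" | sort | head -n 5
```

For the pmon process above, show_env 27232 | grep '^ORACLE_HOME=' would isolate the single variable of interest.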
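Summary: for this kind of problem, we first need to understand why df and du report different space usage. Second, we need to be familiar with the lsof and fuser commands so that we can find the PID of the process that still holds the file. With that PID we can recover the file from the /proc directory, inspect the process's environment, or even kill the process to release the space.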
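If lsof is not available, the same information can be read directly from /proc. The following is a rough sketch rather than a replacement for lsof: list_deleted_fds is a hypothetical name, it only looks at numbered descriptors (not memory-mapped files), and without root it only sees the caller's own processes.

```shell
#!/bin/sh
# list_deleted_fds is a hypothetical helper: scan /proc/<pid>/fd and
# print "PID FD TARGET" for each descriptor whose link target ends in
# " (deleted)". Run as root to see other users' processes.
list_deleted_fds() {
    for link in /proc/[0-9]*/fd/*; do
        # Processes may exit, or be unreadable, while we scan
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            *" (deleted)")
                pid=${link#/proc/}
                pid=${pid%%/*}
                echo "$pid ${link##*/} $target"
                ;;
        esac
    done
}

list_deleted_fds
```

Each line it prints identifies a file that df still counts as used space but du can no longer see, together with the PID and descriptor needed to recover it or to decide which process to stop.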
Finally, a simple example to wrap up this article:
1. First, make sure the lsof tool is installed on the operating system.
[root@rac01 Server]# rpm -ivh lsof-4.78-6.x86_64.rpm
Preparing... ########################################### [100%]
1:lsof ########################################### [100%]
[root@rac01 Server]# which lsof
/usr/sbin/lsof
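2. In one session, run tail -f install2.log so that the tail process holds the file open; in another session, delete the file with rm -rf install2.log.
3. Use lsof to perform the following operations: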
[root@rac01 ~]# lsof | grep deleted
tail 6006 root 3r REG 8,3 29544 4587629 /root/install2.log (deleted)
[root@rac01 ~]# cd /proc/6006/
[root@rac01 6006]# ls
attr cmdline cwd fdinfo loginuid mounts numa_maps pagemap schedstat stat task
auxv comm environ io maps mountstats oom_adj personality sessionid statm wchan
cgroup coredump_filter exe latency mem net oom_score root smaps status
clear_refs cpuset fd limits mountinfo ns oom_score_adj sched stack syscall
[root@rac01 6006]# cd fd
[root@rac01 fd]# ll
total 0
lrwx------ 1 root root 64 Dec 3 19:07 0 -> /dev/pts/0
lrwx------ 1 root root 64 Dec 3 19:07 1 -> /dev/pts/0
lrwx------ 1 root root 64 Dec 3 19:07 2 -> /dev/pts/0
lr-x------ 1 root root 64 Dec 3 19:07 3 -> /root/install2.log (deleted)
[root@rac01 fd]# cd ..
[root@rac01 6006]# cat environ
HOSTNAME=rac01TERM=vt100SHELL=/bin/bashHISTSIZE=1000SSH_CLIENT=172.168.4.123 56823 22OLDPWD=/mnt/ServerSSH_TTY=/dev/pts/0USER=rootLS_COLORS=no=00:fi=00:di=01;34:ln=01;36:pi=40;33:so=01;35:bd=40;33;01:cd=40;33;01:or=01;05;37;41:mi=01;05;37;41:ex=01;32:*.cmd=01;32:*.exe=01;32:*.com=01;32:*.btm=01;32:*.bat=01;32:*.sh=01;32:*.csh=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.gz=01;31:*.bz2=01;31:*.bz=01;31:*.tz=01;31:*.rpm=01;31:*.cpio=01;31:*.jpg=01;35:*.gif=01;35:*.bmp=01;35:*.xbm=01;35:*.xpm=01;35:*.png=01;35:*.tif=01;35:MAIL=/var/spool/mail/rootPATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/binINPUTRC=/etc/inputrcPWD=/rootLANG=en_US.UTF-8SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpassSHLVL=1HOME=/rootLOGNAME=rootSSH_CONNECTION=172.168.4.123 56823 172.168.4.200 22LESSOPEN=|/usr/bin/lesspipe.sh %sG_BROKEN_FILENAMES=1_=/usr/bin/tail
[root@rac01 6006]# lsof /root/ <<<< find the processes that have the /root directory open
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 4934 root cwd DIR 8,3 4096 4587521 /root/
tail 6006 root cwd DIR 8,3 4096 4587521 /root/
[root@rac01 6006]# lsof -c tail <<<< list the files held open by the tail process
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
tail 6006 root cwd DIR 8,3 4096 4587521 /root
tail 6006 root rtd DIR 8,3 4096 2 /
tail 6006 root txt REG 8,3 37704 1448826 /usr/bin/tail
tail 6006 root mem REG 8,3 56479136 1446088 /usr/lib/locale/locale-archive
tail 6006 root mem REG 8,3 1720736 5242891 /lib64/libc-2.5.so
tail 6006 root mem REG 8,3 142488 5242884 /lib64/ld-2.5.so
tail 6006 root 0u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 1u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 2u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 3r REG 8,3 29544 4587629 /root/install2.log (deleted)
[root@rac01 6006]# lsof +d /root/ <<<< show the processes accessing /root and the files directly inside it
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 4934 root cwd DIR 8,3 4096 4587521 /root/
tail 6006 root cwd DIR 8,3 4096 4587521 /root/
[root@rac01 6006]# lsof +D /root/ <<<< show the processes accessing files in /root and all of its subdirectories
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
bash 4934 root cwd DIR 8,3 4096 4587521 /root/
tail 6006 root cwd DIR 8,3 4096 4587521 /root/
[root@rac01 6006]# lsof -d 3 | grep -v grep | grep deleted <<<< show the deleted files that are open on file descriptor 3
tail 6006 root 3r REG 8,3 29544 4587629 /root/install2.log (deleted)
[root@rac01 6006]# lsof -p 6006 <<<< show the files held open by process 6006
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
tail 6006 root cwd DIR 8,3 4096 4587521 /root
tail 6006 root rtd DIR 8,3 4096 2 /
tail 6006 root txt REG 8,3 37704 1448826 /usr/bin/tail
tail 6006 root mem REG 8,3 56479136 1446088 /usr/lib/locale/locale-archive
tail 6006 root mem REG 8,3 1720736 5242891 /lib64/libc-2.5.so
tail 6006 root mem REG 8,3 142488 5242884 /lib64/ld-2.5.so
tail 6006 root 0u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 1u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 2u CHR 136,0 0t0 3 /dev/pts/0
tail 6006 root 3r REG 8,3 29544 4587629 /root/install2.log (deleted)
[root@rac01 6006]# lsof -u root | grep deleted <<<< show the deleted files held open by processes owned by root
tail 6006 root 3r REG 8,3 29544 4587629 /root/install2.log (deleted)
[root@rac01 6006]# cd /proc/6006/fd/
[root@rac01 fd]# cp 3 /root/install2.log <<<< recover the deleted install2.log file
[root@rac01 fd]# cd /proc/6006/fd
[root@rac01 fd]# ll
total 0
lrwx------ 1 root root 64 Dec 3 19:07 0 -> /dev/pts/0
lrwx------ 1 root root 64 Dec 3 19:07 1 -> /dev/pts/0
lrwx------ 1 root root 64 Dec 3 19:07 2 -> /dev/pts/0
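lr-x------ 1 root root 64 Dec 3 19:07 3 -> /root/install2.log (deleted) <<<< the (deleted) status does not change after recovery, because the copy is a new file and the descriptor still points at the original inode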
[root@rac01 fd]# cd /root/
[root@rac01 ~]# ls
anaconda-ks.cfg core.11067 core.526 core.6027 core.7965 Desktop install2.log install.log install.log.syslog
--end--