# Not able to login as root and home directories missing

## manu_leo

Hi Experts, this is completely unbelievable to me. Everything was working fine until 3 days ago, when I noticed that I could not log in as root. I logged in as another user, reset the root password, and could then log in, but the next day the same issue came back: root login fails until I reset the password again. This has been going on for 3 days now. All of a sudden, a normal user is also unable to log in, and when I reset his password, his home directory is missing completely from the server. This has happened to 2 users so far; their complete home directories are no longer under /home/.

Is this some kind of attack on my network or servers? Has anyone else faced a similar issue?

Where can I find the logs for it? I see nothing under /var/log/messages.

Appreciate your prompt response and thanks in advance.

----------

## NeddySeagoon

manu_leo,

It's unlikely to be an attack.  The last thing an attacker wants is to alert you that your system has been compromised.

I suspect HDD or RAM issues.  Check dmesg.
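A long dmesg is easier to read if you filter it for lines that tend to accompany disk, filesystem or memory trouble. This is only a sketch: the helper name `dmesg_suspects` and the pattern list are my own assumptions, and a clean result does not prove the hardware is healthy.

```shell
# Hypothetical helper: keep only kernel log lines that often accompany
# disk, filesystem or memory trouble. The pattern list is illustrative,
# not exhaustive.
dmesg_suspects() {
    grep -Ei 'I/O error|EDAC|machine check|hardware error|segfault|XFS.*(error|corrupt)'
}

# Usage: dmesg | dmesg_suspects
```

The function reads standard input, so it also works on a saved copy of the log.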

Does the 

```
lastlog
```

output look correct?

Do as little as possible that will cause filesystem writes.

Does 

```
mount
```

  show any of your filesystems as read only?

That can indicate a filesystem or underlying HD issue.
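If the mount output is long, the read-only entries can be picked out mechanically. A minimal sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the name `list_ro_mounts` is made up for illustration. Some pseudo filesystems are read only by design, so look at the real disk partitions in the result.

```shell
# Hypothetical helper: print mount points whose option list contains "ro".
# Reads /proc/mounts by default; pass another file to try the filter out.
list_ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

Calling `list_ro_mounts` with no argument checks the running system. The option list is split on commas so that options such as `errors=remount-ro` do not produce false matches.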

Do not be tempted to run fsck. It can make a bad situation worse.  You need a backup of some sort before you use fsck.

Silly question time ... is/was /home mounted over NFS ?

Is it still mounted?

If all that looks good, boot directly into memtest86 and let it run several cycles.

If it finds nothing, that would indicate that your RAM and RAM controller are probably not at fault.

(Absence of evidence is not evidence of absence)

Next test is to run 

```
smartctl -a
```

 on the drive.

You need to use a live distro for that if you don't have it already.

Minimising writes means that you must not use emerge.
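For the SMART check above, the raw values of a few attributes are usually the first thing to read. A minimal sketch, assuming an ATA drive whose `smartctl -A` table carries the raw value in the tenth column; the helper name `smart_suspects` is made up, attribute names vary between vendors, and /dev/sdX is a placeholder for your drive.

```shell
# Hypothetical helper: from a `smartctl -A` attribute table on stdin,
# print the attributes commonly associated with failing sectors when
# their raw value (assumed to be column 10) is above zero.
smart_suspects() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 { print $2, $10 }'
}

# Usage, as root, with /dev/sdX replaced by the real device:
#   smartctl -H /dev/sdX                  # overall health verdict
#   smartctl -A /dev/sdX | smart_suspects
```

No output from the filter only means those three counters are zero. Read the full `smartctl -a` report as well.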

----------

## manu_leo

Thanks Neddy for the detailed reply.

dmesg - [1621203.196342] Paper[9449]: segfault at be0522 ip 00007ff5e25a9e47 sp 00007ffef25f8d08 error 6 in libc-2.23.so[7ff5e2523000+190000]

```
[   22.352433] exanic 0000:05:00.0 enp5s0: interface opened

[   22.352444] IPv6: ADDRCONF(NETDEV_UP): enp5s0: link is not ready

[   22.796144] IPv6: ADDRCONF(NETDEV_CHANGE): enp5s0: link becomes ready

[   23.120582] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory

[   23.129165] NFSD: starting 90-second grace period (net ffffffff81e97300)

[   25.692374] tg3 0000:01:00.0 eno1: Link is up at 1000 Mbps, full duplex

[   25.692375] tg3 0000:01:00.0 eno1: Flow control is on for TX and on for RX

[   25.692377] tg3 0000:01:00.0 eno1: EEE is disabled

[   25.692385] IPv6: ADDRCONF(NETDEV_CHANGE): eno1: link becomes ready

[  194.617425] sysctl (7577): drop_caches: 3

[ 3895.797649] sysctl (9272): drop_caches: 3

[144760.174389] sysctl (30587): drop_caches: 3

[181963.307360] sysctl (21712): drop_caches: 3

[231166.875709] sysctl (30463): drop_caches: 3

[268368.193368] sysctl (15154): drop_caches: 3

[317571.625476] sysctl (16668): drop_caches: 3

[354774.806869] sysctl (20449): drop_caches: 3

[403977.327404] sysctl (23636): drop_caches: 3

[407812.601637] sysctl (32011): drop_caches: 3

[441180.516650] sysctl (19086): drop_caches: 3

[490384.759230] sysctl (32710): drop_caches: 3

[527587.902002] sysctl (11968): drop_caches: 3

[749603.101713] sysctl (3373): drop_caches: 3

[786808.483565] sysctl (8396): drop_caches: 3

[836007.122393] sysctl (16979): drop_caches: 3

[849121.623267] TCP: eno1: Driver has suspect GRO implementation, TCP performance may be compromised.

[873211.840255] sysctl (24864): drop_caches: 3

[922416.048314] sysctl (1609): drop_caches: 3

[959620.123744] sysctl (4461): drop_caches: 3

[1008820.162870] sysctl (13343): drop_caches: 3

[1046023.947944] sysctl (17424): drop_caches: 3

[1095228.200261] sysctl (26573): drop_caches: 3

[1132431.117347] sysctl (991): drop_caches: 3

[1354446.728138] sysctl (19672): drop_caches: 3

[1391648.844474] sysctl (23174): drop_caches: 3

[1440853.907795] sysctl (32477): drop_caches: 3

[1478055.507894] sysctl (2461): drop_caches: 3

[1527258.597492] sysctl (32337): drop_caches: 3

[1534574.144041] sysctl (5026): drop_caches: 3

[1564462.683953] sysctl (3161): drop_caches: 3

[1613665.975618] sysctl (1440): drop_caches: 3

[1621203.196342] Paper[9449]: segfault at be0522 ip 00007ff5e25a9e47 sp 00007ffef25f8d08 error 6 in libc-2.23.so[7ff5e2523000+190000]

[1650867.577775] sysctl (17236): drop_caches: 3

[1700071.466414] sysctl (12667): drop_caches: 3

[1737271.321856] sysctl (19953): drop_caches: 3

[1959288.634520] sysctl (20979): drop_caches: 3

[1996492.199768] sysctl (21141): drop_caches: 3

[2045693.660818] sysctl (2352): drop_caches: 3

[2082899.636587] sysctl (4803): drop_caches: 3
```

lastlog looks normal and shows only the users who are supposed to log in. For the others, it shows "Never logged in".

mount looks good to me, but still for your reference -

```
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)

sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)

devtmpfs on /dev type devtmpfs (rw,nosuid,size=10240k,nr_inodes=33006655,mode=755)

devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)

tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,noexec)

tmpfs on /run type tmpfs (rw,nosuid,nodev,noexec,mode=755)

/dev/sdb3 on / type xfs (rw,noatime,attr2,inode64,noquota)

mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)

fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)

selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)

cgroup_root on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755)

openrc on /sys/fs/cgroup/openrc type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/lib64/rc/sh/cgroup-release-agent.sh,name=openrc)

cpuset on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)

cpu on /sys/fs/cgroup/cpu type cgroup (rw,nosuid,nodev,noexec,relatime,cpu)

cpuacct on /sys/fs/cgroup/cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct)

freezer on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)

/dev/sdb2 on /boot type xfs (rw,noatime,attr2,inode64,noquota)

/dev/sdb4 on /var type xfs (rw,noatime,attr2,inode64,noquota)

/dev/sdb5 on /home type xfs (rw,noatime,attr2,inode64,noquota)

binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)

rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw,relatime)

nfsd on /proc/fs/nfsd type nfsd (rw,nosuid,nodev,noexec,relatime)
```

I need to install memtest86 to make sure nothing is wrong with the mem or the mem controller.

My concern is: why would a fault in memory or the HDD delete all user directories from /home, and how could the passwords for root and a few other accounts just change with no logs or messages?

----------

## NeddySeagoon

manu_leo,

Running memtest86 from inside Linux is not useful. It needs to be run in place of the kernel, so you must boot into it.

Random memory errors can do anything, depending where they occur.

If they are in RAM that is not used, you don't notice.

If an instruction gets changed to an illegal instruction, the program will be killed with an illegal instruction exception, if it tries to execute the instruction.

If some data is changed, anything can happen.

The difference between data and instructions is only context.  Think of the BIOS reading the MBR into RAM, during the boot process.

To the BIOS, doing the reading, it is reading some data.  Once it's loaded, it becomes a stream of instructions when the BIOS jumps to its entry point.

Do you have ECC RAM and is ECC enabled?

If the ECC system detects a correctable error, it is corrected and there is a message in dmesg.

If the error is not correctable, either the affected program is stopped or the kernel panics.

All 1 bit errors can be detected and corrected.

All 2 bit errors can be detected.

Errors in 3 or more bits, all bets are off.
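If the kernel has an EDAC driver loaded for your memory controller, the running error totals are also exposed in sysfs. A minimal sketch, assuming the usual /sys/devices/system/edac/mc layout with one `mc*` directory per controller; `edac_counts` is a name I made up, and on a box without ECC or without the driver it simply reports zeros.

```shell
# Hypothetical helper: sum corrected (ce) and uncorrected (ue) ECC error
# counts across all memory controllers known to EDAC.
# The base directory is a parameter so the function can be tried on a test tree.
edac_counts() {
    base=${1:-/sys/devices/system/edac/mc}
    ce=0
    ue=0
    for mc in "$base"/mc*; do
        [ -r "$mc/ce_count" ] && ce=$((ce + $(cat "$mc/ce_count")))
        [ -r "$mc/ue_count" ] && ue=$((ue + $(cat "$mc/ue_count")))
    done
    echo "corrected=$ce uncorrected=$ue"
}
```

Reading these files does not write to any disk, so it fits with keeping writes to a minimum.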

Is the user's home directory missing (does not appear in

```
ls /home
```

), or is it empty?

Users can delete the contents of /home/<user>/ but not the /home/<user>  directory itself.  That has to be done by root.

----------

## manu_leo

Neddy, that is what I am planning to do: reboot the box and run memtest in place of the kernel.

The entire home directories of multiple users are missing from /home. /home itself is present, but all the user directories are gone.  This is what surprises me.

Thanks again.

----------

## NeddySeagoon

manu_leo,

What does 

```
df -h
```

 show?

-- edit --

What is on sda ?

----------

## manu_leo

Hi Neddy , this is what I have 

```
# df -h                                 

Filesystem      Size  Used Avail Use% Mounted on 

devtmpfs         10M     0   10M   0% /dev       

tmpfs           126G     0  126G   0% /dev/shm   

tmpfs           126G  2.1M  126G   1% /run       

/dev/sdb3        50G   19G   31G  39% /          

cgroup_root      10M     0   10M   0% /sys/fs/cgroup

/dev/sdb2       494M   61M  433M  13% /boot      

/dev/sdb4        50G  255M   50G   1% /var       

/dev/sdb5       3.6T   26G  3.6T   1% /home 
```

----------

## NeddySeagoon

manu_leo,

So something on /home is using 26G, yet you say /home is empty?

Is that consistent with what

```
du -d1 -h /home
```

 says?

If not, the difference is used space that has become detached from the filesystem.

du will be slow, as it needs to traverse the entire directory tree.
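The two numbers can be put side by side in one go. A minimal sketch, assuming the directory you pass is itself a mount point (otherwise df reports the whole filesystem it lives on); `used_gap` is a name I made up. Some gap is normal because of filesystem metadata, so look for a large one.

```shell
# Hypothetical helper: compare what the filesystem reports as used (df)
# with what is reachable from the directory tree (du), both in KiB.
# Run it as root so du can read every directory.
used_gap() {
    dir=${1:-/home}
    df_used=$(df -kP "$dir" | awk 'NR == 2 { print $3 }')
    du_used=$(du -skx "$dir" 2>/dev/null | awk '{ print $1 }')
    echo "df_used_kib=$df_used du_used_kib=$du_used gap_kib=$((df_used - du_used))"
}

# Usage: used_gap /home
```

The `-x` option keeps du on one filesystem, so anything mounted below the directory is not counted.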

----------

## szatox

```
 /dev/sdb5       3.6T   26G  3.6T   1% /home 
```

lost+found?

Maybe some scan was already launched during boot.

----------

## manu_leo

The 26G you see under /home is there because I had a copy of the things that vanished, and I copied them back manually. Before that, /home had no user directories and every single piece of data had been deleted.

I am still not sure how to move forward from here. I have now configured key-based authentication and have configured vpn in place of ssh port-forwarding.

Please suggest if I need to check something, or if anyone has faced similar issues.

Thanks.

----------

## NeddySeagoon

manu_leo,

How did you make the copy?

If you used mv in place of cp ...

Is the command still in /root/.bash_history ?
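Searching the history for move and delete commands is quicker than reading it all. A minimal sketch, assuming bash with its default history file; `history_moves` is a made-up name. Bash normally writes its history when the shell exits, and the file can be truncated, so a missing entry proves nothing.

```shell
# Hypothetical helper: show history lines that invoke mv or rm, with
# their line numbers. The history file is a parameter;
# /root/.bash_history is the bash default for root.
history_moves() {
    grep -nE '(^|[;&|[:space:]])(sudo[[:space:]]+)?(mv|rm)([[:space:]]|$)' "${1:-/root/.bash_history}"
}

# Usage: history_moves
#        history_moves /path/to/another/history/file
```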

There is another long shot.

Have all your normal users log out.

Log in as root - directly. If you log in as a user then su, the next step won't work.

Using your direct root login, 

```
umount /home
```

What is in /home now?

The right answer is nothing, but your lost files might be there, hidden whenever /home is mounted over them.

When you are finished 

```
mount /home
```

and let your users log in.

/home must not be in use for this.
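umount refuses to work while /home is busy, so it is worth confirming that the unmount really happened before you trust the listing. A minimal sketch, assuming the usual /proc/mounts layout; `is_mounted` is a name I made up.

```shell
# Hypothetical helper: succeed if the given directory is currently a
# mount point. Reads /proc/mounts by default; a second argument
# overrides the file for testing.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Sketch of the sequence, to be run from a direct root login:
#   umount /home && ! is_mounted /home && ls -la /home
#   mount /home
```

If the `ls` never runs, the unmount failed and anything you see under /home is still the mounted filesystem.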

----------

