The Jenkins master is getting stuck in a restart loop because it’s failing to acquire a lease on the Jenkins home directory, preventing it from starting up properly.
This typically happens when the Jenkins home directory is on a network-attached storage (NAS) or a shared filesystem that’s experiencing issues. Jenkins uses a lock file, jenkins.lock, in its home directory to ensure only one instance is running. If it can’t create or access this lock file, it assumes another instance is running and shuts down, leading to the loop.
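Before going through the individual causes, a quick write probe can tell you whether the home directory accepts new files at all. The following is a minimal sketch, not part of Jenkins itself: the function name probe_home_writable and the /var/lib/jenkins default are assumptions for illustration. Run it as the account that runs Jenkins.

```shell
#!/bin/sh
# Sketch only: the function name and the default path are illustrative assumptions.

# probe_home_writable DIR
# Creates and then removes a throwaway file in DIR.
# Returns 0 if both steps work, non-zero otherwise.
probe_home_writable() {
    dir=$1
    # mktemp fails if DIR is missing, read-only, out of inodes,
    # or not writable by the current user.
    probe=$(mktemp "$dir/.write-probe.XXXXXX" 2>/dev/null) || return 1
    rm -f -- "$probe"
}

# Usage:
#   probe_home_writable "${JENKINS_HOME:-/var/lib/jenkins}" \
#       && echo "writable" || echo "NOT writable"
```

A failing probe does not say which cause applies, and an empty file may still be created on a disk that is full, so treat a passing probe as a hint rather than proof. On an unresponsive NFS mount the probe itself may hang.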
Here are the most common reasons and how to fix them:
1. NFS Mount Stale or Unresponsive:
The NFS mount for /var/lib/jenkins (or wherever your Jenkins home is) has become stale or is no longer responding to requests. This is the most frequent culprit on Linux systems.
- Diagnosis:
On the Jenkins master node, run:
    sudo mount -l | grep nfs
Look for your Jenkins home directory mount. If it’s listed, try to cd into it. If the cd command hangs or you get a "Stale file handle" error, the mount is bad. You can also check the output of dmesg | tail for NFS-related errors.
- Fix:
Unmount and remount the NFS share:
    sudo umount -f /var/lib/jenkins
    sudo mount -a
(Replace /var/lib/jenkins with your actual Jenkins home directory path. mount -a mounts every entry in /etc/fstab that isn’t already mounted, except those marked noauto.)
- Why it works: This forces a fresh connection to the NFS server, clearing any stale state and allowing Jenkins to properly access its home directory and create the lock file.
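Because a cd into a dead mount can hang your terminal, it can help to wrap the check in a time limit. Here is a rough sketch using the coreutils timeout command; the function name mount_responds and the 5-second default are assumptions, not anything Jenkins or NFS prescribes.

```shell
#!/bin/sh
# Sketch only: mount_responds and its default limit are illustrative assumptions.

# mount_responds DIR [SECONDS]
# Returns 0 if stat on DIR completes within the time limit,
# non-zero if it errors out (e.g. "Stale file handle") or times out.
mount_responds() {
    dir=$1
    limit=${2:-5}
    # stat has to talk to the filesystem; timeout kills it if it blocks.
    timeout -s KILL "$limit" stat "$dir" >/dev/null 2>&1
}

# Usage:
#   mount_responds /var/lib/jenkins 5 || echo "mount is stale or unresponsive"
```

A non-zero result only tells you the directory could not be read in time; check dmesg for the NFS-specific reason before remounting.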
2. Inode Exhaustion on the Filesystem:
The filesystem hosting the Jenkins home directory has run out of inodes, meaning it can’t create new files, including the jenkins.lock file.
- Diagnosis:
Run df -i on the Jenkins master. Check the "IUse%" column for the filesystem where /var/lib/jenkins resides. If it’s at 100%, this is your problem.
- Fix:
This is trickier. You need to free up inodes by deleting files you no longer need. Often, this means finding directories with a huge number of tiny files (e.g., old build logs, temporary files):
    # Count files per directory and show the 20 most populated (requires GNU find)
    find /var/lib/jenkins -xdev -type f -printf '%h\n' | sort | uniq -c | sort -nr | head -20
Manually investigate and delete unnecessary files in those directories. If you can’t free enough, you might need to resize the filesystem or move Jenkins to a filesystem with more inodes.
- Why it works: By freeing up inodes, the filesystem can now create the jenkins.lock file, allowing Jenkins to start.
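If you want to check the IUse% figure from a script (for example in a cron job), something like the sketch below could work. It assumes a df that supports -i, as GNU df does; the function names and the 95% default threshold are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names and the 95% threshold are illustrative assumptions.

# inode_use_percent PATH
# Prints the IUse% value (without the % sign) for the filesystem holding PATH.
# Some filesystems do not report inode counts; df then prints "-".
inode_use_percent() {
    # -P keeps each filesystem on a single line; with -i, column 5 is IUse%.
    df -P -i "$1" | awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# inodes_nearly_exhausted PATH [THRESHOLD]
# Returns 0 if usage >= THRESHOLD, 1 if below, 2 if usage is unknown.
inodes_nearly_exhausted() {
    pct=$(inode_use_percent "$1")
    case "$pct" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$pct" -ge "${2:-95}" ]
}

# Usage:
#   inodes_nearly_exhausted /var/lib/jenkins 95 && echo "running out of inodes"
```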
3. Corrupted or Locked jenkins.lock File:
The jenkins.lock file itself might be corrupted, or a previous, unclean shutdown left it in a state where it’s inaccessible or appears to be held by a non-existent process.
- Diagnosis:
Check for the existence of /var/lib/jenkins/jenkins.lock. If it’s there and Jenkins is not running, it’s likely the problem. You can also run ls -l /var/lib/jenkins/jenkins.lock to see its permissions and ownership.
- Fix:
Carefully remove the lock file:
    sudo rm /var/lib/jenkins/jenkins.lock
Important: Ensure no other Jenkins processes are actually running on the master before doing this. You can verify this with ps aux | grep jenkins (the grep command itself shows up in the output; ignore that entry). If there are no running Jenkins processes, it’s safe to delete.
- Why it works: Removing the stale lock file allows Jenkins to create a fresh one upon startup, signaling that it’s the sole running instance.
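The "check first, then delete" order matters, so here is one way to tie the two steps together. This is a sketch under assumptions: the function names are invented, and the pgrep pattern jenkins\.war is a guess at how Jenkins is launched on your system, so adjust it to match your actual command line.

```shell
#!/bin/sh
# Sketch only: function names and the process pattern are illustrative assumptions.

# jenkins_running
# Returns 0 if a process whose command line mentions jenkins.war is found.
jenkins_running() {
    # Be conservative: without pgrep we cannot tell, so report "running".
    command -v pgrep >/dev/null 2>&1 || return 0
    pgrep -f 'jenkins\.war' >/dev/null 2>&1
}

# remove_stale_lock HOME_DIR
# Deletes HOME_DIR/jenkins.lock only when no Jenkins process is detected.
remove_stale_lock() {
    lock="$1/jenkins.lock"
    if [ ! -e "$lock" ]; then
        echo "no lock file at $lock"
        return 0
    fi
    if jenkins_running; then
        echo "Jenkins appears to be running; leaving $lock in place" >&2
        return 1
    fi
    rm -f -- "$lock" && echo "removed $lock"
}

# Usage (as root or the Jenkins user):
#   remove_stale_lock /var/lib/jenkins
```

Note that pgrep -f matches any command line containing the pattern, including an editor that has a file named jenkins.war open, so a refusal is worth double-checking by hand.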
4. Insufficient Disk Space:
While less common for just the lock file, if the filesystem hosting /var/lib/jenkins is completely full, Jenkins won’t be able to write the lock file or even its initial startup logs.
- Diagnosis:
Run df -h on the Jenkins master and check the "Avail" column for the filesystem where /var/lib/jenkins resides. If it shows 0 available, this is the issue.
- Fix:
Free up disk space by deleting old build artifacts, logs, or other unnecessary files from that filesystem.
    # Example: delete build logs older than 30 days
    find /var/lib/jenkins/jobs/*/builds -type f -name "log" -mtime +30 -delete
- Why it works: With available space, Jenkins can write the necessary lock file and any other required startup files.
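The "Avail" check can also be scripted so that you notice a filling disk before Jenkins stops starting. A minimal sketch follows; the function names and the 1 GiB figure in the usage line are arbitrary assumptions.

```shell
#!/bin/sh
# Sketch only: function names and thresholds are illustrative assumptions.

# avail_kb PATH
# Prints the available space, in 1 KiB blocks, on the filesystem holding PATH.
avail_kb() {
    # -P keeps each filesystem on one line; -k reports 1 KiB blocks.
    # Column 4 is "Available".
    df -P -k "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_space PATH MIN_KB
# Returns 0 if at least MIN_KB KiB are available, 1 if not, 2 if unknown.
has_free_space() {
    kb=$(avail_kb "$1")
    case "$kb" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$kb" -ge "$2" ]
}

# Usage: warn when less than 1 GiB (1048576 KiB) is left.
#   has_free_space /var/lib/jenkins 1048576 || echo "low disk space"
```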
5. Incorrect File Permissions or Ownership:
The Jenkins user (jenkins by default) doesn’t have the necessary read/write permissions for the Jenkins home directory or the jenkins.lock file.
- Diagnosis:
Check the ownership and permissions of /var/lib/jenkins and its contents:
    ls -ld /var/lib/jenkins
    ls -l /var/lib/jenkins/jenkins.lock
The owner should typically be the jenkins user and group.
- Fix:
Recursively change ownership and permissions:
    sudo chown -R jenkins:jenkins /var/lib/jenkins
    sudo chmod -R u+rwX,g+rwX,o-rwx /var/lib/jenkins
(Ensure jenkins is the correct user/group for your installation.)
- Why it works: Correct permissions ensure the Jenkins process can create and manage the jenkins.lock file and write to other necessary locations within its home directory.
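Before running a recursive chown, you may want to see which files actually have the wrong owner, since that often points at what went wrong (for example, a command run as root inside the home directory). The sketch below lists them with find; the function name list_not_owned is an assumption.

```shell
#!/bin/sh
# Sketch only: the function name is an illustrative assumption.

# list_not_owned DIR USER GROUP
# Prints every path under DIR (including DIR itself) whose owner is not USER
# or whose group is not GROUP. USER and GROUP may be names or numeric IDs.
list_not_owned() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Usage:
#   sudo sh -c '. ./this-file.sh; list_not_owned /var/lib/jenkins jenkins jenkins | head -20'
```

An empty result means ownership is consistent, in which case permissions (the mode bits) or a security module are the more likely culprits.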
6. SELinux or AppArmor Restrictions:
Security modules like SELinux or AppArmor might be preventing Jenkins from writing to its home directory or creating the lock file.
- Diagnosis:
Check system logs for SELinux or AppArmor denials related to Jenkins or its home directory.
  - SELinux: sudo ausearch -m avc -ts recent or sudo grep jenkins /var/log/audit/audit.log
  - AppArmor: sudo dmesg | grep -i apparmor or check /var/log/syslog for AppArmor denials.
- Fix:
  - SELinux: Temporarily set SELinux to permissive mode (sudo setenforce 0) and try starting Jenkins. If it works, you need to create or adjust SELinux policies, then switch back to enforcing mode (sudo setenforce 1). A common fix might involve sudo chcon -Rt svirt_sandbox_file_t /var/lib/jenkins (though this is a generic example; specific policies are better).
  - AppArmor: You might need to edit the AppArmor profile for Jenkins (e.g., /etc/apparmor.d/usr.sbin.jenkins) to allow writes to the home directory, then reload the profile (sudo systemctl reload apparmor).
- Why it works: Adjusting the security policy allows Jenkins to perform the necessary file operations it needs for startup.
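If you’re not sure which of the two modules is even active on the machine, a small check like the following can narrow it down before you dig through logs. It is only a sketch: the function name active_lsm is invented, and it looks at just two indicators (the getenforce command and the AppArmor module parameter in /sys), so other setups may need a different test.

```shell
#!/bin/sh
# Sketch only: the function name and output format are illustrative assumptions.

# active_lsm
# Prints "selinux:Enforcing", "selinux:Permissive", "apparmor", or "none".
active_lsm() {
    if command -v getenforce >/dev/null 2>&1; then
        mode=$(getenforce 2>/dev/null)
        case "$mode" in
            Enforcing|Permissive)
                echo "selinux:$mode"
                return 0
                ;;
        esac
    fi
    # The AppArmor kernel module exposes Y here when it is enabled.
    if [ -r /sys/module/apparmor/parameters/enabled ] &&
       grep -q '^Y' /sys/module/apparmor/parameters/enabled; then
        echo "apparmor"
        return 0
    fi
    echo "none"
}

# Usage:
#   active_lsm
```

A result of "none" suggests the earlier causes in this list are more likely than a security policy.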
After resolving one of these issues, Jenkins should be able to acquire the lock, start up cleanly, and you’ll then be ready to tackle any build failures caused by missing plugins.