Investigating virtual machine file locks on ESXi (10051)
Details
- Powering on a virtual machine fails.
- Unable to power on a virtual machine.
- Adding an existing virtual machine disk (VMDK) to a virtual machine that is already powered on fails.
- You see the error:
Failed to add disk scsi0:1. Failed to power on scsi0:1
- When powering on the virtual machine, you see one of these errors:
Unable to open Swap File
Unable to access a file since it is locked
Unable to access a file <filename> since it is locked
Unable to access Virtual machine configuration
- In the /var/log/vmkernel log file, you see entries similar to:
WARNING: World: VM xxxx: xxx: Failed to open swap file <path>: Lock was not free
WARNING: World: VM xxxx: xxx: Failed to initialize swap file <path>
- When opening a console to the virtual machine, you may receive the error:
Error connecting to <path><virtual machine>.vmx because the VMX is not started
- Powering on the virtual machine results in the power on task remaining at 95% indefinitely.
- Cannot power on the virtual machine after deploying it from a template.
- The virtual machine reports conflicting power states between vCenter Server and the ESXi/ESX host console.
- Attempting to view or open the .vmx file using a text editor (for example, cat or vi), reports an error similar to:
cat: can't open '[name of vm].vmx': Invalid argument
The purpose of file locking
Note: For more information on finding the lock owners after upgrading to VMware ESXi 5.5 Patch 5, see Finding the lock owners of a VMDK or file on a VMFS datastore in VMware ESXi 5.5 P05 (2110152).
To prevent concurrent changes to critical virtual machine files and file systems, ESXi/ESX hosts establish locks on these files. In certain circumstances, these locks may not be released when the virtual machine is powered off. The files cannot be accessed by the servers while locked, and the virtual machine is unable to power on.
These virtual machine files are locked during runtime:
VMNAME.vswp
DISKNAME-flat.vmdk
DISKNAME-ITERATION-delta.vmdk
VMNAME.vmx
VMNAME.vmxf
vmware.log
Initial quick test
To get your critical virtual machine running:
- Migrate the virtual machine to the host on which it was last running and attempt to power it on.
- If unsuccessful, continue to attempt a power on of the virtual machine on other hosts in the cluster. When you reach the host holding the file locks, the virtual machine should power on, because the file locks in place are valid on that host.
- If you still cannot power on the virtual machine, continue with the steps below to investigate in more detail.
To identify and release the lock on the files, perform these relevant steps for your version of ESXi.
ESXi troubleshooting steps
Identifying the locked file
To identify the locked file, attempt to power on the virtual machine. During the power on process, an error may display or be written to the virtual machine’s logs. The error and the log entry identify the virtual machine and files:
- Where applicable, open and connect the vSphere or VMware Infrastructure (VI) Client to the respective ESXi host, VirtualCenter Server, or the vCenter Server host name or IP address.
- Locate the affected virtual machine, and attempt to power it on.
- Open a remote console window for the virtual machine.
- If the virtual machine is unable to power on, an error on the remote console screen displays with the name of the affected file.
Note: If an error does not display, proceed to these steps to review the vmware.log file of the virtual machine:
- Log in as root to the ESXi host using an SSH client.
- Confirm that the virtual machine is registered on the server and obtain the full path to the virtual machine by running this command:
# vim-cmd vmsvc/getallvms
The output returns a list of the virtual machines registered to the ESXi host. Each line contains the datastore and the location within it of the virtual machine’s .vmx file.
You see output similar to:
[datastore] VMDIR/VMNAME.vmx
Verify that the affected virtual machine appears in this list. If it is not listed, the virtual machine is not registered on this ESXi host. The host on which the virtual machine is registered typically holds the lock. Ensure that you are connected to the proper host before proceeding.
- Move to the virtual machine’s directory:
# cd /vmfs/volumes/datastore/VMDIR
- Use a text viewer to read the contents of the vmware.log file. At the end of the file, look for error messages that identify the affected file.
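The step from the `[datastore] VMDIR/VMNAME.vmx` entry to the `cd /vmfs/volumes/datastore/VMDIR` command can be sketched as a small helper. This is only an illustration: the function name `vmx_directory` is hypothetical, and it assumes the entry has exactly the `[datastore] path/to/file.vmx` shape shown above.

```python
import re


def vmx_directory(entry):
    """Turn '[datastore] VMDIR/VMNAME.vmx' into '/vmfs/volumes/datastore/VMDIR'."""
    match = re.match(r"\[(?P<ds>[^\]]+)\]\s+(?P<path>.+\.vmx)\s*$", entry.strip())
    if match is None:
        raise ValueError("not a '[datastore] path.vmx' entry: %r" % entry)
    # Everything before the last '/' is the virtual machine's directory.
    directory, _, _ = match.group("path").rpartition("/")
    base = "/vmfs/volumes/" + match.group("ds")
    return base + "/" + directory if directory else base


print(vmx_directory("[datastore] VMDIR/VMNAME.vmx"))
```

The returned path is the directory that holds vmware.log for that virtual machine.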
Locating the lock and removing it
A virtual machine can be moved between hosts, so the host where the virtual machine is currently registered may not be the host maintaining the file lock. The lock must be released by the ESX/ESXi host that owns the lock. This host is identified by the MAC address of the primary management vmkernel interface.
Note: Locked files can also be caused by backup programs keeping a lock on the file while backing up the virtual machine. If there are any issues with the backup, the lock may not be removed correctly. In some cases, you may need to disable your backup application or reboot the backup server to clear the hung backup.
This lock can be maintained by the VMkernel of any host connected to the same storage. Start by identifying the server whose VMkernel may be locking the file.
To identify the server:
- Report the MAC address of the lock holder by running this command (except on an NFS volume):
# vmkfstools -D /vmfs/volumes/UUID/VMDIR/LOCKEDFILE.xxx
Note: Run this command on all commonly locked virtual machine files (as listed under The purpose of file locking) to ensure that all locked files are identified.
- For servers prior to ESXi 4.1, the output of this command is written to the system’s logs. From ESXi 4.1, the output is also displayed on-screen. Included in this output is the MAC address of any host that is locking the .vmdk file. To locate this information, check /var/log/messages. Look for lines similar to:
Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Lock [type 10c00001 offset 13058048 v 20, hb offset 3499520
Hostname vmkernel: gen 532, mode 1, owner xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx mtime xxxxxxxxxx]
Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Addr <4, 136, 2>, gen 19, links 1, type reg, flags 0x0, uid 0, gid 0, mode 600
Hostname vmkernel: 17:00:38:46.977 cpu1:1033)len 297795584, nb 142 tbz 0, zla 1, bs 2097152
Hostname vmkernel: 17:00:38:46.977 cpu1:1033)FS3: 132: <END supp167-w2k3-VC-a3112729.vswp>
The second line displays the MAC address after the word owner, as the last 12 hexadecimal digits of the owner value. In this example, the MAC address of the management vmkernel interface of the offending ESXi host is xx:xx:xx:xx:xx:xx. After logging in to the server, the process maintaining the lock can be analyzed.
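As a rough illustration of reading the owner field, the sketch below pulls the last 12 hexadecimal digits out of an owner value and formats them as a MAC address. The function name `lock_owner_mac` and the sample owner value are hypothetical and invented for the example; the only assumption about the output format is the `owner xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx` layout shown above.

```python
import re

# 'owner' followed by an 8-8-4-12 hexadecimal value; the last group holds the MAC.
OWNER_RE = re.compile(
    r"owner\s+[0-9a-fA-F]{8}-[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-([0-9a-fA-F]{12})"
)


def lock_owner_mac(output):
    """Return the lock owner's MAC address, or None if absent or all zeros."""
    match = OWNER_RE.search(output)
    if match is None:
        return None
    digits = match.group(1).lower()
    if digits == "0" * 12:
        # An all-zero owner means no exclusive owner is recorded in this field.
        return None
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


# Hypothetical sample line, not taken from a real host.
sample = "gen 532, mode 1, owner 4d8a1b2c-3e4f5a6b-7c8d-001a2b3c4d5e mtime 1234567]"
print(lock_owner_mac(sample))  # prints 00:1a:2b:3c:4d:5e
```

The result can then be compared with the MAC address of each host's management vmkernel interface.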
In ESXi 4.0 U3, ESXi 4.1 U1, and later versions, there is a new field which can identify a read-only or multi-writer lock owner.
You see an output similar to:
[root@test-esx1 testvm]# vmkfstools -D test-000008-delta.vmdk
Lock [type 10c00001 offset 45842432 v 33232, hb offset 4116480
gen 2397, mode 2, owner 00000000-00000000-0000-000000000000 mtime 5436998] <-- MAC address of lock owner
RO Owner[0] HB offset 3293184 xxxxxxxx-xxxxxxxx-xxxx-xxxxxxxxxxxx <-- MAC address of read-only lock owner
Addr <4, 80, 160>, gen 33179, links 1, type reg, flags 0, uid 0, gid 0, mode 100600
len 738242560, nb 353 tbz 0, cow 0, zla 3, bs 2097152
- If the command vmkfstools -D test-000008-delta.vmdk does not return a valid MAC address in the top field (it returns all zeros), review the RO Owner line below it to see which MAC address owns the read-only/multi-writer lock on the file. In the preceding example, the offending MAC address is xx:xx:xx:xx:xx:xx.
- In some cases, the lock may be an NFS lock or a lock generated by another system or product that can use or read VMFS file systems. If the file is locked by a VMkernel child or cartel world, the offending host running the process/world must be rebooted to clear it.
- After you have identified the host or backup tool (machine that owns the MAC) locking the file, power it off or stop the responsible service, then restart the management agents on the host running the virtual machine to release the lock.
- To determine if the MAC address corresponds to the host that you are currently logged in to, see Identifying the ESX Service Console MAC address (1001167). If it does not, you must establish a console or SSH connection to each host that has access to this virtual machine’s files.
- When you have identified the host holding the lock, unregister the virtual machine from the host.
Note: If you cannot find the virtual machine in the host inventory in vCenter Server, open a vSphere or VI Client connection directly to the ESXi host. Check for any entry in the inventory labelled Unknown VM. If found, remove the unknown virtual machine from the inventory.
- When the virtual machine has been removed from the inventory, register it on the host holding the lock and attempt to power it on. You may have to set DRS to manual to ensure that the virtual machine powers up on the correct host.
If the virtual machine still does not power on, complete these procedures while logged in to the offending host.
Note: If you have already identified a VMkernel lock on the file, skip the rest of the section and go to the Further troubleshooting steps section in this article.
- Check if the virtual machine still has a World ID assigned to it:
For ESXi 4.x, run these commands on all ESXi hosts:
# cd /tmp
# vm-support -x
You see output similar to:
Available worlds to debug:
wid=world_id name_of_VM_with_locked_file
On the ESXi 4.x host where the virtual machine is still running, kill the virtual machine. This releases the lock on the file. To kill the virtual machine, run this command:
# vm-support -X world_id
Where world_id is the World ID of the virtual machine with the locked file.
Note: This command takes 5-10 minutes to complete. Answer No to Can I include a screenshot of the VM, and answer Yes to all subsequent questions.
After stopping the process, you can power on the virtual machine or access the file/resource.
For ESXi 5.x and later, the esxcli command-line utility can be used locally or remotely to display a list of the virtual machines which are currently running on the host.
Obtain a list of all running virtual machines, identified by their World ID, Cartel ID, display name, and path to the .vmx configuration file using this command:
# esxcli vm process list
You see output similar to:
VirtualMachineName
   World ID: 1268395
   Process ID: 0
   VMX Cartel ID: 1264298
   UUID: ab cd ef ...
   Display Name: VirtualMachineName
   Config File: /path/VirtualMachineName.vmx
Two worlds are listed. The first world number (in this example, 1268395) is the Virtual Machine Monitor (VMM) for vCPU 0. The second world number (in this example, 1264298) is the virtual machine Cartel ID.
On the ESXi 5.x and later host where the virtual machine is still running, kill the virtual machine. This releases the lock on the file. To kill the virtual machine, run this command:
# esxcli vm process kill --type=soft --world-id=1268395
For additional information, see Mapping a virtual machine world number to a virtual machine name (1001101).
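When many virtual machines are running, picking the right World ID out of the esxcli vm process list output by eye is error-prone. The sketch below parses saved output into one dictionary per virtual machine. It assumes that each record starts with the virtual machine name on its own line, continues with "Key: value" lines, and is separated from the next record by a blank line; the function names are hypothetical.

```python
def parse_vm_process_list(output):
    """Return one dict per virtual machine from 'esxcli vm process list' text."""
    vms = []
    current = None
    for line in output.splitlines():
        if not line.strip():
            current = None  # a blank line ends the current record
            continue
        if current is None:
            # The first line of a record is the virtual machine name.
            current = {"Name": line.strip()}
            vms.append(current)
            continue
        key, sep, value = line.strip().partition(":")
        if sep:
            current[key.strip()] = value.strip()
    return vms


def world_id_for(vms, display_name):
    """Return the World ID string for a display name, or None if not running."""
    for vm in vms:
        if vm.get("Display Name") == display_name:
            return vm.get("World ID")
    return None


sample = """VirtualMachineName
   World ID: 1268395
   Process ID: 0
   VMX Cartel ID: 1264298
   UUID: ab cd ef ...
   Display Name: VirtualMachineName
   Config File: /path/VirtualMachineName.vmx
"""
print(world_id_for(parse_vm_process_list(sample), "VirtualMachineName"))  # prints 1268395
```

The value returned is the one passed to --world-id in the kill command above.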
- In ESXi 4.1/5.x/6.x, to find the owner of the locked file of a virtual machine, run this command:
# vmkvsitools lsof | grep Virtual_Machine_Name
You see output similar to:
11773 vmx 12 46 /vmfs/volumes/Datastore_Name/VirtualMachineName/VirtualMachineName-flat.vmdk
You can then run this command to obtain the PID of the process for the virtual machine:
# ps | grep Virtual_Machine_Name
You can kill the process with this command:
# kill -9 PID
To generate a core dump when killing a virtual machine that is running but hung and nonresponsive, use the command kill -6 PID or kill -11 PID.
Note: In ESXi 4.1, ESXi 5.x, and ESXi 6.x, you can use the k command in esxtop to send a signal to and kill a running virtual machine process. On the ESXi console, enter Tech Support mode and log in as root. For more information, see Tech Support Mode for Emergency Support (1003677).
- Run the esxtop utility using the esxtop command.
- Press c to switch to the CPU resource utilization screen.
- Press Shift+f to display the list of fields.
- Press c to add the column for the Leader World ID.
- Identify the target virtual machine by its Name and Leader World ID (LWID).
- Press k.
- In the World to kill prompt, type in the Leader World ID that you identified earlier and press Enter.
- Wait 30 seconds and validate that the process is no longer listed.
Determining if the file is being used by a running virtual machine
If the file is being accessed by a running virtual machine, the lock cannot be usurped or removed. It is possible that the host holding the lock is running the virtual machine and has become unresponsive, or another running virtual machine has the disk incorrectly added to its configuration prior to power-on attempts.
To determine if the virtual machine processes are running:
- To determine if the virtual machine is registered on the host, run this command as the root user:
# vim-cmd vmsvc/getallvms
Note: The output lists the vmid for each virtual machine registered. Record this information as it is required in the remainder of this process on the ESXi server.
- To assess the virtual machine’s current state on the host, run this command:
# vim-cmd vmsvc/power.getstate vmid
- To stop the virtual machine process, see Powering off an unresponsive virtual machine on an ESX host (1004340).
Further troubleshooting steps
Ensure that the virtual disk is not locked by a backup appliance
The virtual disk may be locked by a backup appliance. First, ensure that backups are not in progress. Unmount the virtual disk from the Edit Settings of the backup appliance either through the VMware vSphere Client or the vSphere Web Client. For more information, see Unable to delete the virtual machine snapshot due to locked files (2017072).
Using the touch utility to determine if the file can be locked
The touch utility is designed to update the access and modification time stamp of the specified file or directory. As such, the command can be used to test the file and directory locking mechanism in the VMFS filesystem, where the procedure is expected to fail on locked files. Using touch is the preferred method because the changes to the resource are minimal, although other commands can be used, such as head -c 0.
Note: You must run the touch command on each host in the cluster. The touch command will succeed on the host that holds the lock on the file and fail on every other host. Then you can investigate what process on that host is locking the file.
To test the file or directory locking functionality, run this command:
# touch filename
Note: Running touch * performs the operation on all files in the current directory.
The touch * command can result in these outcomes:
- If the touch * command succeeds, then the command successfully made changes to the date/time stamp and has verified that the file can be, and has been, locked (then unlocked). At this point, retry the virtual machine power-on operation to see if it succeeds.
- If the touch * command fails with a device or resource busy message, it indicates that a process is maintaining a lock on the file or directory. That process may be running on any of the ESXi/ESX hosts that have access to the file; the command fails on every host except the one holding the lock. If the message is reported, proceed to the next section.
- If another error message is reported, it may indicate that the metadata pertaining to file or directory locking on VMFS is invalid or corrupt. If this is the case, collect diagnostic information from the ESXi/ESX host and submit a Support Request. For more information, see Collecting diagnostic information for VMware products (1008524) and How to Submit a Support Request.
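The three outcomes above can be sketched as a small classifier. This is an illustration under assumptions, not a supported tool: it uses Python's os.utime as a stand-in for touch, the function names are hypothetical, and it assumes that the device or resource busy message corresponds to the EBUSY error code. Unlike touch, os.utime does not create a missing file, so a nonexistent path is reported as "other".

```python
import errno
import os


def classify_touch_error(err):
    """Map an errno value (or None for success) to one of the three outcomes."""
    if err is None:
        return "unlocked"  # timestamp updated; retry the power-on operation
    if err == errno.EBUSY:
        return "locked"    # device or resource busy: a process holds a lock
    return "other"         # anything else: collect diagnostic information


def touch_test(path):
    """Try to update the timestamps of an existing file and classify the result."""
    try:
        os.utime(path, None)  # None sets access and modification time to now
    except OSError as exc:
        return classify_touch_error(exc.errno)
    return classify_touch_error(None)
```

Calling touch_test on each commonly locked file, on each host, gives the same information as repeating the touch command by hand.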
Removing the .lck file (NFS only)
The files on the virtual machine may be locked via NFS storage. You can identify this by files denoted with .lck-#### (where #### is the value of the fileid field returned from a GETATTR request for the file being locked) at the end of the file name. This is an NFS file lock and is only listed when using the ls -la command because it is a hidden file (in versions of ESX/ESXi prior to 5.x). For more information, see Powering on a virtual machine on NFS or trying to remove an NFS Datastore fails with errors “Unable to access a file since it is locked” or “Resource is in use” (1012685).
For more information on NFS file locking, see the VMware NFS Best Practices Whitepaper.
Caution: These can be removed safely only if the virtual machine is not running.
Note: VMFS volumes do not have .lck files. The locking mechanism for VMFS volumes is handled within VMFS metadata on the volume.
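As a minimal sketch of spotting these lock files without reading ls -la output by eye, the helper below filters a directory listing for hidden names that begin with .lck-. The function names are hypothetical, and the assumption that every NFS lock file name starts with the .lck- prefix is an interpretation of the description above.

```python
import os


def nfs_lock_names(names):
    """Return the names that look like NFS lock files (hidden '.lck-...' files)."""
    return sorted(name for name in names if name.startswith(".lck-"))


def find_nfs_lock_files(directory):
    """List NFS lock files in a virtual machine directory on an NFS datastore."""
    # os.listdir includes hidden files, so no special flag is needed.
    return nfs_lock_names(os.listdir(directory))
```

The helper only lists the files. Whether one can be removed still depends on the caution above: the virtual machine must not be running.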
Determining if the .vmdk file is in use by other virtual machines
A lock on the .vmdk file can prevent a virtual machine from starting. However, since virtual machine disk files can be configured for use with any virtual machine, the file may be locked by another virtual machine that is currently running.
To determine if the virtual machine’s disk file is configured for use on more than one virtual machine, run this command:
# egrep -i DISKNAME.vmdk /vmfs/volumes/*/*/*.vmx
Notes:
- This command attempts to locate the specified disk name among all .vmx configuration files for the virtual machines that are visible to the ESX/ESXi host. A Device or resource busy message is printed for each virtual machine that is running but not registered to this ESX host. You must run this command on each ESX/ESXi host in the infrastructure or specifically on hosts that have access to the storage containing the virtual machine’s files.
- If any additional virtual machines are configured to use the disk, determine if they are currently running. Powering off the other virtual machine using the disk file releases the lock. You must determine which virtual machine should have ownership of the file, then reconfigure your virtual machines to prevent this error from occurring again.
- As part of their operation, many virtual machine backup solutions temporarily attach the virtual machine’s .vmdk files to themselves. In such cases, if the backup fails and/or the host shuts down, the backup virtual machine may still have another virtual machine’s .vmdk file(s) attached. If that is the case, the other virtual machine is usually powered on first, which then creates a locked file condition when you attempt to power on the backup virtual machine. Check using Edit Settings on your backup solution’s virtual machine to see if it has a disk attached that belongs to a different virtual machine. If it does, power down the backup virtual machine, select the appropriate disk and choose Remove to remove the disk from the virtual machine.
Warning: Do not delete files from disk.
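The egrep search across .vmx files can also be sketched in Python when a list of matches and a separate list of unreadable files is more convenient than raw output. This is an assumption-laden illustration: the function names are hypothetical, it requires Python 3, and it does a literal case-insensitive substring match, whereas egrep treats the dot in DISKNAME.vmdk as a wildcard.

```python
import glob


def references_disk(lines, disk_name):
    """True if any line mentions the disk name (case-insensitive, literal match)."""
    needle = disk_name.lower()
    return any(needle in line.lower() for line in lines)


def vmx_files_referencing(disk_name, pattern="/vmfs/volumes/*/*/*.vmx"):
    """Return (hits, unreadable): .vmx paths naming the disk, and paths that could not be read."""
    hits, unreadable = [], []
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "r", errors="replace") as handle:
                if references_disk(handle, disk_name):
                    hits.append(path)
        except OSError:
            # For example, a busy .vmx of a virtual machine running on another host.
            unreadable.append(path)
    return hits, unreadable
```

More than one path in hits suggests the disk is configured on several virtual machines; entries in unreadable correspond to the busy messages described in the notes.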
If the .vmdk file is not in use by other virtual machines, confirm that there are no VMkernel processes locking the file, per the preceding section, Locating the lock and removing it. If a host can be determined but the specific offending VMkernel child process ID cannot be identified, you must reboot the server to clear the lock.
Note: You can also try to migrate the virtual machine to another host and power it on. If that ESX host has the lock for the virtual machine, it should allow you to power it on.
Rebooting the ESX/ESXi host which is locking the files
By this stage, you have already looked for identifiable VMkernel processes that maintain locks on the required files, but an unidentified child process still maintains the lock. You have identified the server via the vmkfstools -D command in earlier steps, the lsof utility (ESX only) yields no offending processes, and no other virtual machines are locking the file.
The server should be restarted to allow the virtual machine to be powered on again.
Note: Collect diagnostic information prior to rebooting if you want to investigate the issue further through analysis by VMware Technical Support.
Migrate the virtual machines from the server and restart it using these steps:
- Migrate or vMotion all virtual machines from the host to alternate hosts.
- When the virtual machines have been evacuated, place the host into maintenance mode and reboot it.
Note: If you have only one ESX/ESXi host or do not have the ability to perform vMotion or migrate virtual machines, you must schedule downtime for the affected virtual machines prior to rebooting. When the host has rebooted, start the affected virtual machine again.
Check the integrity of the virtual machine configuration file (.vmx)
For more information on checking the integrity of the virtual machine configuration file, see Verifying ESX/ESXi virtual machine file integrity (1003743).
Note: If a virtual machine does not power on, it may be pointing to two disks in the .vmx file. Remove one of the disks from the virtual machine and attempt to power on again.
Note: For related information, see Cannot power on a virtual machine because the virtual disk cannot be opened (1004232).
Opening a Support Request
If the problem persists after completing the steps in this article:
- Gather diagnostic information. For more information, see Collecting diagnostic information for VMware products (1008524).
- File a support request with VMware Support and note this Knowledge Base article ID (10051) in the problem description. For more information, see How to Submit a Support Request.
Additional Information
For translated versions of this article, see:
- Español: La máquina virtual no funciona por bloqueo o por falta de archivos (1032063)
- 日本語: ESXi/ESX での仮想マシン ファイル ロックの調査 (1033280)
- Portuguese: Máquina virtual não liga devido a arquivos bloqueados (2032408)
- 简体中文: 调查 ESXi/ESX 上的虚拟机文件锁定 (2081803)
- Deutsch: Analysieren von VM-Dateisperren in ESXi/ESX (2145086)
Update History
03/05/2010 – Added note to restart the Management agents before restarting the ESX host.
11/04/2010 – Added lsof steps.
12/15/2011 – Added link to KB article 1003743
07/23/2012 – Added additional symptom
08/24/2012 – Added note regarding .vmx
09/13/2012 – Added additional symptom
This Article Replaces
1004925