Deadlock Detector and Solver (DDS)

Deadlock is among the most complex problems affecting the reliability of programs containing multiple, asynchronous threads. When undetected, deadlocks can lead to permanent thread blockage. Current detection methods are typically based on timeout and rollback of computations, resulting in significant delays. This paper presents Deadlock Detector and Solver (DDS), which can quickly detect and resolve circular deadlocks in Java programs. DDS uses a supervisory controller, which monitors program execution and automatically detects deadlocks resulting from hold-and-wait cycles on monitor locks. When a deadlock is detected, DDS uses a preemptive strategy to break the deadlock. Based on our experiments, DDS can in fact resolve deadlocks without significant run-time overhead.


INTRODUCTION
The onset of multicore hardware is fueling a trend toward concurrent software systems. With increasing frequency, software developers are using language constructs such as Java threads to take advantage of multicore hardware capabilities [14].
When multiple threads share the same data structures, object synchronization is necessary to avoid data races. A data race occurs when multiple threads simultaneously access the same structure and at least one of those accesses modifies it. A well-known disadvantage of object locking is that it can cause deadlocks, for instance, when a hold-and-wait cycle occurs involving locked objects. Deadlocks are notoriously difficult to detect through testing because they tend to manifest themselves randomly.
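As a concrete illustration of a hold-and-wait cycle on monitor locks, the following minimal sketch starts two threads that acquire two locks in opposite order. It is not part of DDS: the class and method names are hypothetical, and the JVM's built-in `ThreadMXBean.findMonitorDeadlockedThreads()` is used here only to observe the resulting deadlock, not to resolve it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class HoldAndWaitDemo {
    /**
     * Starts two daemon threads that deadlock on two monitors and returns
     * the number of threads the JVM reports as monitor-deadlocked
     * (0 if nothing was reported within roughly 10 seconds).
     */
    public static int deadlockedThreadCount() throws InterruptedException {
        final Object lockA = new Object();
        final Object lockB = new Object();
        // Ensures each thread holds its first lock before requesting the second.
        final CountDownLatch bothHold = new CountDownLatch(2);

        Thread t1 = new Thread(() -> acquireInOrder(lockA, lockB, bothHold));
        Thread t2 = new Thread(() -> acquireInOrder(lockB, lockA, bothHold));
        // Daemon threads do not keep the JVM alive once main returns.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 200; i++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
            if (ids != null) {
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    private static void acquireInOrder(Object first, Object second, CountDownLatch bothHold) {
        synchronized (first) {
            bothHold.countDown();
            try {
                bothHold.await(); // hold `first` until the other thread holds its lock too
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds `second` and waits for `first`.
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

The latch makes the cycle deterministic for the demonstration; in real programs the interleaving that produces such a cycle occurs only occasionally, which is why testing tends to miss it.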
Current Java tools for detecting deadlocks are either unable to resolve deadlocks [5, 7-9, 16] or suffer from performance degradation [7, 12]. Our Deadlock Detector and Solver (DDS) is a run-time monitoring toolkit that detects and resolves deadlocks involving Java monitor locks without the need for code annotations and with modest performance overhead. This is in contrast with the existing systems [11-13]. DDS does not force a deterministic code execution in a way similar to UnDead [17]. Other authors have used lock graphs and resource graphs to monitor the state of a running program and to detect deadlocks [3-6, 9, 10, 15]. We also use lock graphs; however, we use them not only for detecting deadlocks but also for resolving them. Better yet, DDS imposes an average overhead of only about 5% for real-world Java applications, as measured in our empirical studies.
This paper is organized into two sections: the first presents the deadlock detection mechanism underlying DDS, and the second empirically evaluates the performance of DDS. The evaluation uses benchmarks covering both deadlocked and non-deadlocked Java programs.

OVERVIEW OF DDS TOOLKIT
We use an existing native interface called the Java Virtual Machine Tool Interface (JVMTI) to monitor the execution of a Java program by the Java Virtual Machine (JVM). JVMTI notifies a DDS component called the Deadlock Detector Agent when two kinds of events occur. The first event, monitor_contended_enter, occurs when a Java thread tries to acquire a monitor lock that is already held by another thread, for instance, in order to access the object associated with that lock. The second event, monitor_contended_entered, occurs when the waiting thread actually obtains the lock. DDS registers suitable callbacks for these two events. When the events occur, our DDS callbacks are invoked by the JVMTI. The two callbacks allow us to create and maintain a resource mapping graph. Vertexes V in the graph represent locks held by threads. Edges E represent threads waiting for a lock held by another thread.
The implementation of DDS uses the Java Native Interface (JNI) to "call out" from Java code to external functions and to "call in" from external functions into Java code. Using the calling-in capability of the JNI, our agent issues a function call to a thread holding a lock involved in a circular hold-and-wait. We specifically call the wait method, forcing the receiving thread to release one lock in the hold-and-wait cycle. This thread will be awakened and given back the lock once the lock is available again. As of now, we choose the "victim" thread randomly; however, in the future we plan to use static analysis to identify threads whose locks can be safely removed. In particular, we seek to identify program locations where the acquisition of a second lock by a thread occurs after structures involved in a first lock acquisition were accessed and modified. In that case, the thread is not a good candidate for victim, since releasing its first lock could expose partially modified data to other threads.
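The resolution step relies on a standard property of Java monitors: a thread that calls wait() on an object releases that object's monitor and reacquires it before wait() returns. The following minimal sketch shows only this property in plain Java; it is not the DDS agent, which according to the text issues the call through JNI. The class name, flag, and log strings are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

public class WaitReleasesLock {
    /** Returns a log of the order in which the two threads held the lock. */
    public static String run() throws InterruptedException {
        final Object lock = new Object();
        final StringBuilder log = new StringBuilder(); // only touched while holding `lock`
        final boolean[] resumed = {false};             // guarded by `lock`
        final CountDownLatch victimHoldsLock = new CountDownLatch(1);

        Thread victim = new Thread(() -> {
            synchronized (lock) {
                log.append("victim:hold ");
                victimHoldsLock.countDown();
                // wait() releases the monitor; the loop guards against spurious wakeups.
                while (!resumed[0]) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                // The monitor has been reacquired at this point.
                log.append("victim:resume");
            }
        });
        victim.start();

        victimHoldsLock.await();
        // This block can only be entered once the victim has released the lock in wait().
        synchronized (lock) {
            log.append("other:acquired ");
            resumed[0] = true;
            lock.notifyAll();
        }
        victim.join();
        return log.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run());
    }
}
```

In a deadlock, the "other" thread corresponds to the next thread in the hold-and-wait cycle: once the victim gives up its lock, that thread can proceed, and the victim resumes when the lock becomes available again.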

EMPIRICAL RESULTS
The goal of DDS is to detect any deadlock that occurs in the system at run time and to resolve it effectively. The agent's effectiveness is measured on the basis of (1) its functional accuracy and (2) the overhead imposed on the monitored system. All the experiments reported below were carried out on a Linux Ubuntu 16.04 LTS machine with a 2.20 GHz Intel Core i7 processor and 6 GB of RAM. So far, the agent has been shown to accurately detect and resolve all deadlocks that occurred in our benchmarks.
The performance of the toolkit was evaluated on two different sets of experiments. The first set was based on two versions of the dining philosophers program. The second set contained benchmarks obtained from two different sources. First, we used the benchmarks Moldyn, Raytracer, and Montecarlo from the Java Grande suite [2]. Three additional benchmarks were obtained from ETH Zurich [1], namely, Sor, Hedc, and Elevator. We ran each benchmark thirty times; the average run times and elapsed times with and without DDS are reported in Table 1. To double check the accuracy of our measurements, we independently measured the run time of the DDS agent as well. A non-deadlocking variant of the philosophers program was used as the control program to determine the raw CPU time and the elapsed time for each program. We observed an average increase of 2.7% in CPU time when a non-deadlocking program was run under agent supervision. The increase in elapsed time was 10.9%. Moreover, we noted a linear relation between DDS's overhead and the number of philosophers. These numbers indicate that our approach is quite scalable.
In addition to our control experiment, we also chose six different benchmarks to calculate the DDS overhead on real-world applications. DDS was evaluated using two input sizes, as illustrated in Table 1, and different thread counts; the exception was the Hedc benchmark, which used a single input size. As illustrated in Table 2, the overhead in elapsed time ranged from 0.2% to 8%. Further, the CPU time overhead across input sizes and thread counts was between 0.4% and 7%, except for the Hedc benchmark, where it was as high as 16%.

CONCLUSIONS
Our results indicate that the run-time monitoring approach underlying DDS has considerable potential for addressing the problem of deadlocks in Java programs. DDS's average performance overhead was quite modest, 5.6% overall, which indicates that DDS does not have adverse effects on program efficiency. When no deadlock occurs, DDS does not significantly slow down the execution of the monitored Java program. Furthermore, this approach can be extended to other programming languages that use object locking for synchronization. In the future, we plan to combine our approach with static analysis in order to ensure data consistency in the thread whose lock is forcibly removed.

Figure 1
Figure 1 is a schematic flow diagram showing how DDS works.

Table 1 : Empirical results of DDS. For each benchmark, we list the line count of Java source code, the number of synchronized statements contained therein, and the size of input sets we used.
In the resource mapping graph (see Figure 1), edges are typically added in the monitor_contended_enter callback and removed in the monitor_contended_entered callback. After adding an edge, we check whether a cycle was formed, in which case a deadlock may have occurred. The formation of a cycle does not always mean that a deadlock occurred: there is a delay between an event of interest happening in the JVM and the JVMTI calling the DDS callbacks, so the agent may have outdated information about the running Java program. Our experiments indicate that this delay is on the order of 10 to 20 milliseconds; however, it does not affect the validity of our analysis because deadlock is persistent. Once a deadlock occurs, it will hold until the agent resolves it. For this reason, we recheck the conditions of a circular hold-and-wait pattern after an additional delay of 20 ms. After verifying that a deadlock has in fact occurred, we use our Deadlock Solver Agent to remove the deadlock. Our algorithm for cycle detection uses a variant of depth-first search to search the graph in O(V+E) time.
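The graph maintenance and cycle check described above can be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the DDS agent's code: locks and threads are identified by strings rather than JVM object references, the set of locks a blocked thread already holds is assumed to be supplied by the caller (the parameter `held`), and the class and method names are hypothetical. Because every newly added edge ends at the requested lock, any new cycle must pass through that lock, so a single depth-first search from it suffices and visits each vertex and edge at most once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LockGraph {
    // For each blocked thread, the edges (heldLock -> wantedLock) it contributes.
    private final Map<String, List<String[]>> edgesByThread = new HashMap<>();

    /**
     * Models the monitor_contended_enter callback: `thread`, already holding
     * the locks in `held`, blocks on `wanted`. Returns true if a cycle exists.
     */
    public boolean onContendedEnter(String thread, Collection<String> held, String wanted) {
        List<String[]> edges = new ArrayList<>();
        for (String h : held) {
            edges.add(new String[] {h, wanted});
        }
        edgesByThread.put(thread, edges);
        return hasCycleThrough(wanted);
    }

    /** Models the monitor_contended_entered callback: the thread is no longer waiting. */
    public void onContendedEntered(String thread) {
        edgesByThread.remove(thread);
    }

    /** Iterative depth-first search: is `start` reachable from itself? */
    private boolean hasCycleThrough(String start) {
        Map<String, List<String>> adj = new HashMap<>();
        for (List<String[]> edges : edgesByThread.values()) {
            for (String[] e : edges) {
                adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            }
        }
        Deque<String> stack = new ArrayDeque<>(adj.getOrDefault(start, List.of()));
        Set<String> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (v.equals(start)) {
                return true;
            }
            if (seen.add(v)) {
                for (String w : adj.getOrDefault(v, List.of())) {
                    stack.push(w);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        LockGraph g = new LockGraph();
        System.out.println(g.onContendedEnter("T1", List.of("A"), "B")); // false
        System.out.println(g.onContendedEnter("T2", List.of("B"), "A")); // true
    }
}
```

A positive result from this check corresponds only to the "deadlock may have occurred" stage in the text; the recheck after the additional 20 ms delay is not modeled here.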