A cloud-supported coverage control scheme is proposed for multi-agent, persistent surveillance missions. This approach decouples assignment from motion planning operations in a modular framework. Coverage assignments and surveillance parameters are managed on the cloud and transmitted to mobile agents via unplanned and asynchronous exchanges. These updates promote load-balancing, while also allowing effective pairing with typical path planners. Namely, when paired with a planner satisfying mild assumptions, the scheme ensures that (i) coverage regions remain connected and collectively cover the environment, (ii) regions may go uncovered only over bounded intervals, (iii) collisions (sensing overlaps) are avoided, and (iv) for time-invariant event likelihoods, a Pareto optimal configuration is produced in finite time. The scheme is illustrated in simulated missions.
Introduction
Cloud-Supported Multi-Agent Surveillance.
Autonomous sensors are used in many modern surveillance missions, including search and rescue [1], environmental monitoring [2], and military reconnaissance [3]. Such missions often require agents to periodically exchange data with a central cloud (repository) and, when operating in nonideal environments or under hardware limitations, these potentially sporadic exchanges may be the only means of sharing real-time information across agents. For example, autonomous underwater vehicles are virtually isolated due to difficulties in underwater data transfer and rely on periodic surfacing to communicate with a tower [4]. Other applications that may have this constraint include data mules that periodically visit ground robots [5] and supervisory missions that require unmanned vehicles to send data to a remotely located operator [6]. Such scenarios require robust and flexible frameworks for real-time autonomous coordination.
Single-agent surveillance strategies range from simple a priori tour construction [7] to more complex methods involving Markov chains [8], optimization [9], or Fourier analysis [10]. However, it is not straightforward to generalize single-agent approaches to multi-agent missions: Naive approaches in which each agent follows an independent policy often result in poor performance and introduce collision risks, while other generalizations may require joint optimizations that are intractable for even modestly sized problems [11]. Distributed control can sometimes alleviate scaling issues; however, such setups typically rely on ideal peer-to-peer data transfer, making them ill-suited to many cloud-based setups.
In contrast, decomposition-based approaches to multi-agent surveillance, which decouple the assignment and routing problems by dividing the agent workspace, are popular in practice: they offer a straightforward, modular framework that reasonably accomplishes surveillance goals, despite sacrificing optimality. However, in cloud-based architectures that rely solely on agent–cloud exchanges for real-time data transfer, implementing such an approach is not straightforward with existing methods for dynamic workspace decomposition. Indeed, in this case, updated mission information is only relayed to one agent at a time, rendering traditional partitioning schemes, which rely on complete or pairwise coverage updates, inapplicable; existing strategies that utilize single-agent updates, e.g., see Ref. [12], may introduce undesirable configurations or collision risks (Fig. 1).
This work extends existing literature by introducing a cloud-supported, decomposition-based framework for multi-agent persistent surveillance that promotes effective coverage without introducing collision (redundant sensing) risks and without requiring ideal or preplanned data exchanges. As such, the proposed framework allows agents to effectively respond to changes in the environment (spatial event likelihood) without having to be collected and redeployed due to the inability to transmit data to all the agents simultaneously. That is, when mission goals (captured by the event likelihood) change after agents are already in the field, the cloud incrementally relays the updated information through sporadic exchanges with individual agents, driving them to a new coverage configuration without introducing undesirable intermediate states. This framework also allows the cloud to act as a data fusion center if necessary, dynamically combining local sensory information collected by individual agents.
Related Literature.
Multi-agent coverage control problems have generated extensive research. Typical strategies involve optimization [13], auctions [14], metaheuristics [15], potential fields [16], Markov decision processes [17], among others [18]. Of particular relevance is multi-agent persistent surveillance (monitoring), in which a sensor team is tasked with continual surveillance of some region, requiring subregions to be visited multiple (or infinitely many) times to minimize a cost, e.g., the time between visits or the likelihood of detecting events [11]. Persistent surveillance is a generalization of patrolling, where agents follow closed tours to protect or supervise an environment. Most current solutions to patrolling problems utilize operations research, nonlearning multi-agent systems, and multi-agent learning [19]; however, formulations are often one-dimensional and solutions usually reduce to “back and forth” motions that do not readily extend to general scenarios, e.g., see Ref. [20].
The framework herein employs a type of workspace decomposition in constructing solutions to the multi-agent problem. In the context of general robotic applications, the term workspace decomposition refers to any number of strategies that are employed in order to represent a complex workspace or solution space in a simpler, often lower-dimensional form. For robotic motion planning involving obstacles, this often involves the representation of the free configuration space in a graphical or modular form that captures essential connectivity properties, e.g., typical roadmap and cellular decomposition planning methods take this approach [21,22]. In multi-agent applications, workspace decomposition methods can also be used to reduce the complexity of a problem through the assignment of subtasks to individual agents, i.e., for task assignment [23]. This is the approach taken herein to reduce the multi-agent problem into a set of single-agent problems. Since the assignment problem is often difficult to solve (it has been shown to be NP-hard, i.e., nondeterministic polynomial-time hard, in some domains [23]), multi-agent planning solutions based on this type of assignment usually lead to suboptimal solutions. Despite this drawback, assignment-based approaches remain popular in practice due to their simplicity and scalability [11]. Outside of workspace decomposition, other strategies for reducing multi-agent planning problems into a set of single-agent problems include naive approaches, where each agent independently determines its own actions by locally solving a complete problem over the full workspace, and, provided sufficiently reliable sensing and data-transfer capabilities, distributed approaches, in which agents each solve a subproblem based on information shared over a communication network [24].
For planar persistent surveillance, decomposition-based approaches typically consist of two primary components: partitioning and single-agent routing. The most common approaches to optimal partitioning in convex environments are based on Voronoi partitions [25], and effective schemes exist for constructing centroidal Voronoi, equitable, or other types of optimal partitions under communication, sensing, and workload constraints [26–28]. Nonconvex workspaces are typically addressed by representing the environment as a graph, on which a number of graph partitioning schemes can be used [29]. In robotics, discrete partitioning is often considered under communication constraints, e.g., pairwise gossip [30] or asynchronous one-to-base station communication [12]. Our proposed scheme most closely mirrors [12], in which agents communicate sporadically with a base station; however, our approach employs additional logic to ensure that the resultant coverage regions retain properties that are consistent with effective, decomposition-based surveillance.
Single-agent path planners for persistent surveillance typically operate on graphs [31,32], and classical problems, e.g., the traveling salesperson problem [33], often play a key role. Stochasticity can be introduced using probabilistic tools, e.g., Markov chains [8]. Schemes for nondiscrete spaces (open subsets of Euclidean space) are less common. Here, strategies include a priori construction of motion routines [34], adaptation of static coverage strategies [35], the use of random fields [36], and spectral decomposition [10]. The modular framework herein incorporates any single-agent planner satisfying mild assumptions (see Sec. 5).
Remarkably, few papers explicitly consider the implications of combining dynamic partitioning with continuous routing for multi-agent persistent surveillance. Existing research is mostly preliminary, considering ideal data transfer and simplistic methods. Araujo et al. [37] employed a sweeping algorithm for partitioning and guided vehicle motion via lawn-mower patterns, while Nigam and Kroo [38] used rectangular partitions and a reactive routing policy which, in ideal cases, reduces to spiral search patterns. Maza and Ollero [39] used slightly more sophisticated partitioning in tandem with lawn-mower trajectories. In Ref. [40], partitions are based on the statistical expectation of target presence, but ideal communication is assumed. Others, e.g., see Ref. [41], employed decomposition-based structures, but focused on task-assignment without detailed treatment of the combined assignment/routing protocol.
Our work uses a cloud-supported computational framework. Cloud-based robotic infrastructures (cloud robotics) have generated growing research interest, as they can provide many benefits to complex systems, such as the storage and analysis of “big data,” the availability of parallel grid computing, the potential for collective learning, and the utilization of human computation [42]. In multi-agent systems, cloud-supported schemes have been used for a variety of tasks, including collective optimization [43], rendezvous [44], and coordinated task-assignment [12]. Our use of the cloud-based architecture is primarily motivated by supervisory systems involving unmanned vehicles. Here, cloud-based architectures arise naturally, since mobile agents are required to transmit sensor data to a remotely located human operator for analysis (thus requiring a central repository), and harsh operational environments often prohibit reliance on peer-to-peer communication. However, the proposed framework is suitable for use in any similar setup and can also run in parallel with other cloud-supported operations (e.g., data analysis, learning, etc.) within complex missions.
Contributions.
This work develops a cloud-supported, decomposition-based, multi-agent coverage control framework for persistent surveillance, which relies only on sporadic, unscheduled exchanges between agents and a central cloud for data transfer. In particular, we develop a sophisticated partitioning and coordination scheme that can be effectively paired with single-agent trajectory planners. This naturally leads to a complete, modular framework in which high-level coverage is coordinated on the cloud and agent trajectories are generated independently via on-board planners. The framework accommodates realistic constraints, including restrictive communication, dynamic environments, and nonparametric event likelihoods.
Specifically, our dynamic partitioning scheme only requires agents to sporadically upload and download data from the cloud. The cloud runs updates to govern region assignments, while also manipulating high-level surveillance parameters. We prove that this update structure has many desirable properties: Coverage regions collectively form a connected m-covering and evolve at a time-scale that allows for appropriate agent reaction, no subregion remains uncovered indefinitely, local likelihood functions have disjoint support, among others. For certain cases, we show that the set of coverage regions and associated generators converges to a Pareto optimal pair in finite time. We show that the combination of our partitioning scheme with a generic trajectory planner ensures collision (sensing overlap) avoidance, provided the planner obeys natural restrictions. Finally, we illustrate our framework through numerical examples.
Our partitioning scheme is primarily motivated by Patel et al. [12]; however, the algorithms herein are explicitly designed to operate within a multi-agent surveillance framework and introduce additional logic parameters to enforce a set of desirable geometric and temporal properties. Our proposed scheme has the following advantages: First, our framework considers a modified cost function that uses subgraph distances to maintain connectivity of intermediate coverage regions, ensuring that agents can visit their entire assigned region without entering another agent's territory. Second, timing parameters are manipulated to provide inherent collision (redundant sensing) avoidance when the scheme is paired with low-level motion planners (Fig. 2). Third, our algorithms explicitly manipulate local likelihood functions maintained by the agents to guarantee that each has support within an allowable region, promoting seamless and modular pairing with any trajectory planner that uses the support of the event likelihood to govern agent routes, e.g., see Ref. [10]. The framework has these features while maintaining convergence properties similar to those of the algorithms in Ref. [12].
For clarity and readability in what follows, we have presented all the theorem proofs in the Appendix.
Mission Overview and Solution Approach
A team of m mobile agents,2 each equipped with an on-board sensor, is tasked with persistently monitoring a nontrivial, planar region of interest. The primary goal of the mission is to collect sensor data about some dynamic event or characteristic, e.g., an intruder. Collected data are periodically uploaded to the cloud. Agents must move within the environment to obtain complete sensory information. Ideally, agent motion should be coordinated so that:
- (1)
The sensing workload is balanced across agents.
- (2)
No subregion goes unobserved indefinitely.
- (3)
Agents never collide (have sensor overlap).
- (4)
The search is biased toward regions of greater interest.
To achieve these goals, we propose a decomposition-based approach in which each agent's motion is restricted to lie within a dynamically assigned coverage region. The partitioning component (operating on the cloud) defines these coverage regions and provides high-level restrictions on agent motion through the manipulation of surveillance parameters, while the trajectory planning component (operating on-board each agent) incrementally constructs agent motion paths. We assume only asynchronous, cloud–agent data transfer, i.e., agents sporadically exchange data with the cloud, where interexchange times are not specified a priori, but are subject to an upper bound.
Broadly, our framework operates as follows (Fig. 2): Initial coverage variables are communicated to the agents prior to deployment, i.e., relevant initial information is known to each agent at the mission onset. Once in the field, agents communicate sporadically with the cloud. During each agent–cloud exchange, the cloud calculates a new coverage region solely for the communicating agent, along with a set of timing and surveillance parameters that serve to govern the agent's high-level motion behavior, and transmits the update. The update algorithm also alters relevant global variables maintained on the cloud. Once the update completes, the data-link is terminated and the agent follows the trajectory found via its on-board planner. Note that this structure is a type of event-triggered control, since high-level updates only occur in the event of an agent–cloud exchange.
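To make this event-triggered structure concrete, the following minimal Python sketch mimics the exchange pattern described above. All class and method names (Cloud.update, Agent.plan_step, etc.) are illustrative placeholders rather than the paper's actual interfaces.

```python
import random

class Cloud:
    """Central repository: stores the global m-covering and surveillance data."""
    def __init__(self, coverings):
        self.coverings = coverings  # global m-covering P, one region per agent

    def update(self, i, t):
        # Algorithm 1 would run here: recompute agent i's region and
        # high-level surveillance parameters, then transmit only agent i's data.
        return self.coverings[i], {"omega_i": t}

class Agent:
    def __init__(self, i):
        self.i, self.region, self.params = i, None, {}

    def exchange(self, cloud, t):
        # Sporadic agent-cloud exchange; the data link closes right after.
        self.region, self.params = cloud.update(self.i, t)

    def plan_step(self):
        pass  # on-board trajectory planner extends the route between exchanges

m = 4
cloud = Cloud([{i} for i in range(m)])
agents = [Agent(i) for i in range(m)]
for t in range(100):
    if random.random() < 0.1:                     # unscheduled, asynchronous contact
        random.choice(agents).exchange(cloud, t)  # at most one exchange per tick
    for a in agents:
        a.plan_step()                             # motion continues independently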
Problem Setup
The cloud, as well as each agent, has its own local processor. “Global” information is stored on the cloud, while each agent only stores information pertinent to itself.
Convention 1. Subscripts i, j, or ℓ denote an entity or set element relevant to agent i, j, or ℓ, respectively. The superscript A denotes an entity that is stored by the agent's local processor.
A storage summary is shown in Table 1. We expand on these and define other relevant mathematical constructs here.
Agent Dynamics.
Each agent (sensor) i is a point mass that moves with maximum possible speed si > 0. Define s := (s1,…, sm). Note that this setup allows heterogeneity with respect to speed, i.e., agents can travel at different speeds.
Communication Protocol.
Each agent periodically exchanges data with the cloud. Assume the following:
- (1)
Each agent can identify itself to the cloud and transmit/receive data.
- (2)
There is a lower bound on the time between any two successive exchanges involving the cloud.
- (3)
There is an upper bound on the time between any single agent's successive exchanges with the cloud.
Assume that agent–cloud exchanges occur instantaneously, and note that condition 2 implies no two exchanges (involving any agents) occur simultaneously.3 Since computation time is typically small in comparison to interexchange times and exchanges occur sporadically, these assumptions are without significant loss of generality.
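As a concrete illustration of conditions 2 and 3, the sketch below generates exchange times obeying a global lower bound delta between any two exchanges and a per-agent upper bound Delta. The bound names and the earliest-deadline selection rule are assumptions for illustration (the paper leaves both bounds abstract); feasibility requires Delta >= m * delta.

```python
import random

def exchange_schedule(m, horizon, delta=1.0, Delta=50.0):
    """Exchange times with >= delta between any two exchanges (condition 2)
    and <= Delta between one agent's successive exchanges (condition 3).
    Feasibility requires Delta >= m * delta."""
    t, last = 0.0, [0.0] * m
    schedule = []
    while t < horizon:
        i = min(range(m), key=lambda j: last[j])  # agent with nearest deadline
        t = min(last[i] + Delta, t + delta + random.expovariate(0.5))
        schedule.append((t, i))
        last[i] = t
    return schedule

for t, i in exchange_schedule(m=3, horizon=20.0)[:5]:
    print(f"t = {t:5.2f}: agent {i} contacts the cloud")
```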
Environment.
Consider a bounded surveillance environment, represented as a finite grid of disjoint, nonempty, simply connected subregions. We represent the grid as a weighted graph G(Q) = (Q, E), where Q is the set of vertices (each representative of a unique grid element), and E is the edge set comprised of undirected, weighted edges {k1, k2} spanning vertices representing adjacent4 grid elements. Let the weight associated to {k1, k2} be some finite upper bound on the minimum travel distance from any point in the grid element associated to k1 to any point in the grid element associated to k2, along a path that does not leave the union of the two elements. Locations of environmental obstacles and prohibited areas are known and are not included in the graphical representation G(Q).
Consider Q̄ ⊆ Q. A vertex k1 ∈ Q is adjacent to Q̄ if k1 ∉ Q̄ and there exists {k1, k2} ∈ E with k2 ∈ Q̄. Define G(Q̄) as the subgraph of G(Q) induced by the vertex set Q̄. A path on G(Q̄) between k1, kn ∈ Q̄ is a sequence (k1, k2,…, kn), where each kr ∈ Q̄ and {kr, kr+1} ∈ E for r ∈ {1,…, n − 1}. We say G(Q̄) is connected if a path exists in G(Q̄) between any k1, k2 ∈ Q̄. Let dQ̄ be the standard distance on G(Q̄), i.e., dQ̄(k1, k2) is the length of a shortest weighted path in G(Q̄) (if none exists, dQ̄ takes value ∞). Note that dQ̄(k1, k2) ≥ dQ(k1, k2) for any k1, k2 ∈ Q̄. With slight abuse of notation, we also let dQ̄ denote the map taking a vertex and a vertex subset to a length, where dQ̄(k, Q̄′) is the length of a shortest weighted path in G(Q̄) between k and any vertex in Q̄′ ⊆ Q̄.
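Since subgraph distances dQ̄ are used throughout, a short sketch may help; the graph encoding (a dict of weighted adjacency lists) is an assumption for illustration.

```python
import heapq

def subgraph_distance(adj, Qbar, source, targets):
    """Length of a shortest path in G(Qbar) from `source` to the nearest
    vertex in `targets`; returns float('inf') if no such path exists."""
    Qbar, targets = set(Qbar), set(targets)
    if source not in Qbar:
        return float('inf')
    dist, pq = {source: 0.0}, [(0.0, source)]
    while pq:
        d, k = heapq.heappop(pq)
        if k in targets:
            return d
        if d > dist.get(k, float('inf')):
            continue
        for nbr, w in adj[k]:
            if nbr in Qbar and d + w < dist.get(nbr, float('inf')):
                dist[nbr] = d + w
                heapq.heappush(pq, (d + w, nbr))
    return float('inf')

# Example: 4-cycle a-b-c-d; removing 'b' forces the long way round, so the
# subgraph distance exceeds the full-graph distance, as noted above.
adj = {'a': [('b', 1), ('d', 5)], 'b': [('a', 1), ('c', 1)],
       'c': [('b', 1), ('d', 5)], 'd': [('a', 5), ('c', 5)]}
print(subgraph_distance(adj, {'a', 'b', 'c', 'd'}, 'a', {'c'}))  # 2.0
print(subgraph_distance(adj, {'a', 'c', 'd'}, 'a', {'c'}))       # 10.0
```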
Coverage Regions.
An m-covering of Q is a family P := {P1,…, Pm} of subsets of Q satisfying (i) ∪i=1,…,m Pi = Q, and (ii) Pi ≠ ∅ for all i. Define Covm(Q) as the set of all the possible m-coverings of Q. An m-partition of Q is an m-covering that also satisfies (iii) Pi ∩ Pj = ∅ for all i ≠ j. An m-covering or m-partition P is connected if each Pi is connected. In what follows, the cloud maintains an m-covering P of Q, and surveillance responsibilities are assigned by pairing each agent i with Pi ∈ P (called agent i's coverage region). Each agent maintains a copy Pi^A of Pi. The cloud also stores a set c := {c1,…, cm} ⊂ Q (ci is the generator of Pi), and each agent i maintains a copy ci^A of ci.
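The following predicates restate these definitions directly (connectivity reuses the induced-subgraph notion of Sec. 3.3); the encoding of a covering as a list of vertex sets is an illustrative assumption.

```python
def is_m_covering(P, Q):
    # (i) the regions collectively cover Q; (ii) no region is empty.
    return set().union(*P) == set(Q) and all(len(Pi) > 0 for Pi in P)

def is_m_partition(P, Q):
    # An m-covering whose elements are pairwise disjoint.
    pairwise_disjoint = sum(len(Pi) for Pi in P) == len(set().union(*P))
    return is_m_covering(P, Q) and pairwise_disjoint

def is_connected(adj, Pi):
    # Connectivity of the subgraph of G(Q) induced by Pi (depth-first search).
    Pi = set(Pi)
    if not Pi:
        return False
    stack, seen = [next(iter(Pi))], set()
    while stack:
        k = stack.pop()
        if k in seen:
            continue
        seen.add(k)
        stack.extend(n for n, _ in adj[k] if n in Pi)
    return seen == Pi
```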
Identifiers, Timers, and Auxiliary Variables.
The proposed algorithms introduce logic and timing variables to ensure an effective overall framework. To each k ∈ Q, assign an identifier IDk ∈ {1,…, m}. Define ID := {IDk}k∈Q, and let P(ID) := {P1(ID),…, Pm(ID)}, where Pi(ID) := {k ∈ Q : IDk = i}. Note P(ID) is an m-partition of Q. For each agent i, define a timer Ti having dynamics Ṫi = −1 if Ti ≠ 0, and Ṫi = 0 otherwise. Define T := {T1,…, Tm}. Each agent i maintains a local timing variable Ti^A. Even though Ti^A plays a similar role to Ti, note that Ti^A is constant unless explicitly updated, while Ti has autonomous dynamics. Next, the cloud maintains a set ω := {ω1,…, ωm}, where ωi is the time of agent i's most recent exchange with the cloud. Each agent maintains a copy of ωi. Finally, each agent stores a subset of its local coverage region Pi^A, which collects vertices that have recently been added to Pi^A.
Likelihood Functions.
The likelihood of a relevant event occurring within any subset of the surveillance region is maintained on the cloud in the form of a time-varying probability mass function5 Φ, where Φ(k, t) denotes the likelihood associated with vertex k at time t. For simplicity, assume that, at any t, the instantaneous support, supp(Φ(⋅, t)) := {k ∈ Q : Φ(k, t) > 0}, equals Q.
The conditions defining the local likelihood Φi^A in Eq. (1) are understood as follows: At some time t, an element k ∈ Q only belongs to supp(Φi^A(⋅, t)) if (i) k ∈ Pi^A and (ii) sufficient time has passed since k was first added to Pi^A, as determined by the timing parameters ωi and Ti^A. In general, each Φi^A will be different6 from Φ.
Remark 1 (Global Data). If global knowledge of Φ is not available instantaneously to agent i, Φi^A can alternatively be defined by replacing Φ(k, t) in Eq. (1) with Φ(k, ωi), i.e., the value most recently downloaded from the cloud. All the subsequent theorems hold under this alternative definition.
Remark 2 (Data Storage). The cost of storing a graph as an adjacency list is O(|Q| + |E|). The generator set c, each element of P, and the identifier set ID are stored as integral vectors. The timer set T and the likelihood Φ are stored as time-varying real vectors, while the set ω is stored as a (static) real vector. Thus, the cost of storage on the cloud is O(|Q| + |E| + m). Similarly, each agent's local storage cost is O(|Q| + |E|).
Dynamic Coverage Update Scheme
Adopt the following convention for the remaining analysis.
Convention 2. Suppose that:
- (1)
.
- (2)
Given a specific time instant, superscripts “−”or “+” refer to a value before and after the instant in question, respectively.
Additive Set.
We start with a definition.
Definition 1 (Additive Set). Given (c, P), the additive set Ai is the largest connected subset of Q satisfying:
- (1)
Pi^A ⊆ Ai.
- (2)
For any k ∈ Ai∖Pi^A and any j ≠ i with k ∈ Pj:
- (a)
Tj = 0.
- (b)
dAi(ci, k)/si < dPj(cj, k)/sj.
The following characterizes well-posedness of Definition 1.
Proposition 1 (Well-Posedness). If Pi^A is connected and disjoint from the local coverage regions of all other agents, then Ai exists and is unique.
Proof. With the specified conditions, Pi^A is connected and satisfies conditions 1 and 2 in Definition 1; Ai is then the unique, maximally connected superset of Pi^A satisfying conditions 1 and 2.
Under the conditions of Proposition 1, if h ∈ Ai∖Pi^A, then, for any j ≠ i with h ∈ Pj, we have Tj = 0 and there is a path from ci to h in G(Ai) whose speed-normalized length is smaller than that of an optimal path spanning cj and h within G(Pj).
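Definition 1 is self-referential (condition 2(b), as reconstructed here, measures distance within Ai itself), so the following greedy outward-growth sketch should be read as an illustration of the conditions, not as the paper's exact construction. Here, regions, c, s, and T are the cloud-side variables, and subgraph_distance is the routine sketched in Sec. 3.3.

```python
def additive_set(adj, PiA, ci, si, regions, c, s, T, i):
    """Grow a candidate additive set outward from agent i's local region."""
    A = set(PiA)
    changed = True
    while changed:
        changed = False
        frontier = {n for k in A for n, _ in adj[k]} - A
        for k in frontier:
            owners = [j for j in range(len(regions))
                      if j != i and k in regions[j]]
            # Condition 2 (reconstructed): every other owner j of k must have
            # an expired timer, and reaching k from c_i inside the candidate
            # set must be faster than reaching it from c_j inside G(P_j).
            admissible = all(
                T[j] == 0 and
                subgraph_distance(adj, A | {k}, ci, {k}) / si
                < subgraph_distance(adj, regions[j], c[j], {k}) / s[j]
                for j in owners)
            if admissible:
                A.add(k)
                changed = True
    return A
```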
Cloud-Supported Coverage Update.
If (i) each agent is solely responsible for events within its coverage region, and (ii) events occur proportionally to Φ, then the cost H(c, P, t) := Σi Σk∈Pi dPi(ci, k) Φ(k, t)/si is understood as the expected time required for an agent to reach a randomly occurring event from its region generator at time t; related functions are studied in Refs. [12,26], and [30]. Algorithm 1 defines the operations performed on the cloud when agent i makes contact at time t0. Here, the input ΔH > 0 is a constant parameter.7 Recall from Sec. 3.3 that, given a vertex k and sets Q̄′ ⊆ Q̄ ⊆ Q, the value dQ̄(k, Q̄′)/si represents the minimum time required for agent i to travel from k to any node in the set Q̄′. Therefore, the operations in lines 1 and 2 of Algorithm 2 are implicitly max–min operations that, intuitively, define upper bounds on the time required for the agents to vacate areas that have shifted as a result of the update. Additional remarks to aid in reader understanding are given by the comments within the algorithms (italicized text preceded by a “%” character).
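Since Algorithm 1 itself is not reproduced in the text above, the following simplified sketch conveys its flavor under the stated properties: the communicating agent claims its additive set, the generator is re-centered to minimize the regional cost, and timers upper-bound the vacate time of agents that lost territory. The identifier/ω bookkeeping and the local-likelihood manipulation are omitted, and all structural details beyond the surrounding description are assumptions.

```python
def region_cost(adj, Pi, ci, si, Phi):
    # Region i's contribution to H(c, P, t): expected generator-to-event time.
    return sum(subgraph_distance(adj, Pi, ci, {k}) / si * Phi[k] for k in Pi)

def cloud_update(adj, P, c, s, T, Phi, i):
    """Simplified, hypothetical cloud-side update on contact with agent i."""
    Ai = additive_set(adj, P[i], c[i], s[i], P, c, s, T, i)
    lost = {j: P[j] & (Ai - P[i]) for j in range(len(P)) if j != i}
    P[i] = Ai                      # claim; losing agents drop vertices at their
                                   # own next exchange, so P may transiently overlap
    c[i] = min(P[i], key=lambda k: region_cost(adj, P[i], k, s[i], Phi))
    for j, gone in lost.items():   # vacate-time bound, cf. Algorithm 2, lines 1-2
        keep = P[j] - gone
        if gone and keep:
            T[j] = max(subgraph_distance(adj, P[j], k, keep) for k in gone) / s[j]
    return P, c, T
```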
Consider the following initialization assumptions.
Assumption 1 (Initialization). The following properties are satisfied when t = 0:
- (1)
P is a connected m-partition of Q.
- (2)
Ti = 0 for all i ∈ {1,…, m}.
- (3)
For all i∈{1,…, m}
(a) .
(b) ci ∈ Pi.
(c) .
(d) .
(e) .
Note that conditions 1 and 3(b) together imply that ci ≠ cj for any j ≠ i. Our first result guarantees well-posedness of Algorithm 1.
Theorem 1 (Well-Posedness). Under Assumption 1, a scheme in which, during each agent–cloud exchange, the cloud executes Algorithm 1 to update relevant global and local variables is well-posed. That is, operations required by Algorithm 1 are well-posed at the time of execution.
Algorithm 1 does not ensure that coverage regions (elements of P) remain disjoint. It does, however, guarantee that the m-covering P, the local coverage regions Pi^A, and the local likelihoods Φi^A retain properties that are consistent with a decomposition-based scheme. Namely, the coverings P and P^A := {P1^A,…, Pm^A} maintain connectivity, and each Φi^A has support that is disjoint from that of all other local likelihoods, yet evolves to provide reasonable global coverage. Further, Algorithm 2 ensures that agents can “safely” vacate areas that are reassigned before newly assigned agents enter. We expand upon these ideas in Secs. 4.3 and 4.4.
Set Properties.
The next result formalizes key set properties.
Theorem 2 (Set Properties). Suppose Assumption 1 holds, and that, during each agent–cloud exchange, the cloud executes Algorithm 1 to update relevant global and local variables. Then, the following hold at any time t ≥ 0:
- (1)
P(ID) is a connected m-partition of Q.
- (2)
P is a connected m-covering of Q.
- (3)
ci∈Pi and ci ≠ cj for any i ≠ j.
- (4)
supp(Φi^A(⋅, t)) ⊆ Pi^A for any i.
- (5)
supp(Φi^A(⋅, t)) ∩ supp(Φj^A(⋅, t)) = ∅ for any i ≠ j.
When the cloud makes additions to an agent's coverage region, newly added vertices are not immediately included in the instantaneous support of the agent's local likelihood. If agent movement is restricted to lie within this support, the delay temporarily prohibits exploration of newly added regions, allowing time for other agents to vacate. Conversely, when regions are removed from an agent's coverage region, Algorithm 1 ensures that a “safe” path, i.e., a path with no collision risk, exists and persists long enough for the agent to vacate. Define agent i's prohibited region as the set of vertices not belonging to the support of its local likelihood, i.e., Q∖supp(Φi^A(⋅, t)). We formalize this discussion here.
Theorem 3 (Coverage Quality). Suppose Assumption 1 holds, and that, during each agent–cloud exchange, the cloud updates relevant global and local coverage variables via Algorithm 1. Then, for any k∈Q and any t ≥ 0:
- (1)
k belongs to at least one agent's coverage region Pi.
- (2)
If k ∉ supp(Φi^A(⋅, t)) for some i with k ∈ Pi, then there exists t0, with t0 − t uniformly upper bounded (see Remark 4), such that, for all t′ ∈ [t0, t0 + ΔH], the vertex k belongs to the set supp(Φi^A(⋅, t′)).
- (3)
If k is removed from Pi at time t, then, for all t′ in a subsequent interval whose length is uniformly bounded below (see Remark 4), we have:
- (a)
k does not belong to the support of any local likelihood Φj^A(⋅, t′) with j ≠ i.
- (b)
There exists a length-minimizing path on G(Pi) from k into the retained coverage region, and all of the vertices along any such path (except the terminal vertex) belong to the set Q∖∪j≠i supp(Φj^A(⋅, t′)).
Theorems 2 and 3 allow Algorithm 1 to operate within a decomposition-based framework to provide reasonable coverage with inherent collision avoidance. Indeed, if agents avoid prohibited regions, the theorems imply that each agent (i) can visit its entire coverage region (connectedness), (ii) allows adequate time for other agents to vacate reassigned regions, and (iii) has a safe route into the remaining coverage region if its current location is removed during an update.
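The delayed-support mechanism can be sketched as follows; this is an assumption-level rendering of Eq. (1) (whose exact form is not shown above), keeping only the two stated conditions: membership in Pi^A and sufficient elapsed time since addition.

```python
def local_likelihood(Phi, PiA, recently_added, omega_i, TiA, t):
    """Local likelihood with delayed support: recently added vertices enter
    the support only once t >= omega_i + TiA (hypothesized timing rule)."""
    supp = {k for k in PiA
            if not (k in recently_added and t < omega_i + TiA)}
    return {k: (Phi[k] if k in supp else 0.0) for k in Phi}

def prohibited_region(Q, PhiA_i):
    # Vertices outside the support of agent i's local likelihood.
    return {k for k in Q if PhiA_i[k] == 0.0}
```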
Remark 3 (Coverage Variables). If Assumption 1 holds and updates are performed with Algorithm 1, then Pi^A ⊆ Pi and ci^A = ci for all i and all t. Thus, both Theorems 2 and 3 are equivalently stated by replacing Pi with Pi^A and ci with ci^A in their respective theorem statements.
Remark 4 (Bounds). Theorem 3 holds if the stated bound is replaced by any upper bound on the time required for an arbitrary agent to travel between two arbitrary vertices within an arbitrary connected subgraph of G(Q).
Convergence Properties.
Our proposed strategy differs from Ref. [12] due to logic, i.e., timing parameters, etc., that ensures effective operation within a decomposition-based framework. Note also that the cost H differs from previous partitioning cost functions in Refs. [12,26], and [30], since it uses subgraph, rather than global graph, distances. As such, convergence properties of the algorithms herein do not follow readily from existing results. Consider the following definition.
Definition 2 (Pareto Optimality). The pair (c, P) is Pareto optimal at time t if
- (1)
H(c, P, t) ≤ H(c, P′, t) for any m-covering P′ ∈ Covm(Q).
- (2)
H(c, P, t) ≤ H(c′, P, t) for any generator set c′.
When Φ is time-invariant (and Assumption 1 holds), Algorithm 1 produces finite-time convergence of coverage regions and generators to a Pareto optimal pair. The limiting coverage regions are “optimal” in that they balance the sensing load in a way that directly considers the event likelihood. Further, the operation only requires sporadic and unplanned agent–cloud exchanges. We formalize this result here.
Theorem 4 (Convergence). Suppose Assumption 1 holds and that, during each agent–cloud exchange, the cloud updates relevant global and local coverage variables via Algorithm 1. If Φ is time-invariant, i.e., Φ(⋅, t1) = Φ(⋅, t2) for all t1, t2, then the m-covering P and the generators c converge in finite time to an m-partition P* and a set c*, respectively. The pair (c*, P*) is Pareto optimal at any time following convergence.
Remark 5 (Weighted Voronoi Partitions). It can be shown that Pareto optimality of (c*, P*) in Theorem 4 implies that, following convergence, P* is a multiplicatively weighted Voronoi partition (generated by c*, weighted by s, subject to density Φ(⋅, t)) by standard definitions (e.g., see Ref. [12]). If the centroid set of each Pi is defined as the set of vertices minimizing the expected travel time to events within Pi, i.e., argmink∈Pi Σh∈Pi dPi(k, h) Φ(h, t)/si, then P* is also centroidal.
Our coverage scheme balances the sensing load in that it updates coverage responsibilities in a way that locally minimizes the expected time required for an appropriate agent to travel from its region generator to a randomly occurring event within the environment. In essence, this serves to avoid unreasonable configurations, e.g., configurations where one agent is assigned responsibility of all the important areas and remaining agents are only given unimportant regions. Further, the update rules consider agent speeds, so faster agents will generally be assigned larger (weighted by the likelihood) coverage regions than slower agents. Similar strategies are employed in traditional load-balancing algorithms that are based on Voronoi partitions and operate over environments with stochastic event likelihoods, e.g., see Refs. [12,26], and [30].
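A brute-force toy example may clarify the speed-aware balancing: on a line of eight unit-weight cells with uniform likelihood, minimizing H over contiguous two-agent splits hands the faster agent the larger block. Here, region_cost and subgraph_distance are the routines sketched earlier, and the restriction to contiguous splits is an illustrative simplification.

```python
Q = list(range(8))
adj = {k: ([(k - 1, 1.0)] if k > 0 else []) +
          ([(k + 1, 1.0)] if k < 7 else []) for k in Q}
Phi = {k: 1.0 / 8 for k in Q}
s = [1.0, 2.0]                               # agent 1 is twice as fast

def H(P, c):
    return sum(region_cost(adj, P[i], c[i], s[i], Phi) for i in range(2))

def split_cost(split, c0, c1):
    return H([set(range(split)), set(range(split, 8))], [c0, c1])

best = min(((split, c0, c1) for split in range(1, 8)
            for c0 in range(split) for c1 in range(split, 8)),
           key=lambda x: split_cost(*x))
print(best)  # split = 3: the slow agent keeps 3 cells, the fast agent takes 5
```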
Theorem 4 and Remark 5 provide a rigorous characterization of the type of load-balancing provided in the static likelihood case. A few comments are in order. First, Pareto optimal pairs (and multiplicatively weighted, centroidal Voronoi partitions) are nonunique. Theorem 4 only guarantees that one possible Pareto optimal pair will be found in the static likelihood case and does not exclude the existence of a lower-cost configuration. Second, the coverage configurations produced by our algorithms may not be equitable, i.e., the probability of events in each coverage region may vary across agents, even in the static case. The development of a strategy that produces equitable partitions within a similar cloud-supported architecture is an open problem and an interesting area of future research.
Decomposition-Based Surveillance
This section pairs the proposed partitioning framework with a generic, single-vehicle trajectory planner, forming the complete, multi-agent surveillance framework.
Complete Routing Algorithm.
By Theorem 2, the support of each local likelihood Φi^A (i) lies entirely within the corresponding coverage region and (ii) is disjoint from the support of the other local likelihoods. By Theorem 3, (i) any vertex can only go uncovered over bounded intervals, and (ii) the parameter ΔH is a lower bound on the time that a recently uncovered vertex must remain covered before it can become uncovered again. These results suggest that an intelligent routing scheme that carefully restricts motion according to the instantaneous support of the local likelihood functions could achieve adequate coverage while maintaining collision avoidance. This motivates the following assumption.
Assumption 2 (Agent Motion). Each agent i has knowledge of its position at any time t, and its on-board trajectory planner operates under the following guidelines:
- (1)
Generated trajectories obey agent motion constraints.
- (2)
Trajectories are constructed incrementally and can be altered in real-time.
- (3)
The agent is never directed to enter its prohibited region, i.e., grid elements associated with Q∖supp(Φi^A(⋅, t)).
Further, each agent precisely traverses its generated trajectories.
Note that condition 3 of Assumption 2 implies that the agent is never directed to leave the support of its local likelihood Φi^A. Algorithm 3 presents the local protocol for agent i. Here, the on-board trajectory planner is used to continually update agent trajectories as the mission progresses (line 1). As such, the low-level characteristics of each individual agent's motion (i.e., the relation between the underlying likelihood function, the coverage configuration, and the resultant trajectory) depend on the particular planner employed.
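A hypothetical rendering of the agent-side loop body is sketched below, reusing the Agent/Cloud placeholders from Sec. 2 and the helpers sketched above; Algorithm 3's actual pseudocode is not reproduced in this text, so the line correspondences are approximate.

```python
def local_protocol_step(agent, cloud, adj, Q, t, in_contact):
    if in_contact:                          # sporadic exchange (cf. line 4)
        agent.region, agent.params = cloud.update(agent.i, t)
        agent.PhiA = agent.params.get("PhiA", agent.PhiA)  # refreshed likelihood
    forbidden = prohibited_region(Q, agent.PhiA)
    if agent.vertex in forbidden:
        # Stranded by a reassignment (cf. lines 5 and 6): retreat toward the
        # nearest safe vertex; Theorem 3 guarantees a safe path exists and
        # persists long enough to traverse.
        safe = set(agent.region) - forbidden
        agent.goal = min(safe, key=lambda h: subgraph_distance(
            adj, set(agent.region) | {agent.vertex}, agent.vertex, {h}))
    else:
        agent.plan_step()                   # support-restricted planner (line 1)
```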
Collision Avoidance.
Although Assumption 2 locally prevents agents from entering prohibited regions, dynamic coverage updates can still place an agent within its prohibited region if the vertex corresponding to its location is abruptly removed during an update. If this happens, Algorithm 3 constructs a route from the agent's location back into a region where there is no collision risk. With mild assumptions, this construction: (i) is well-defined and (ii) does not present a transient collision risk. We formalize this result here.
Theorem 5 (Collision Avoidance). Suppose Assumptions 1 and 2 hold, and that each agent's initial position lies within its initial coverage region Pi. If each agent's motion is locally governed according to Algorithm 3, where the update in line 4 is calculated by the cloud via Algorithm 1, then no two agents will ever collide.
Remark 6 (Agent Dynamics). We assume point mass dynamics for simplicity. However, all the theorems herein also apply under alternative models, e.g., nonholonomic dynamics, provided that the surveillance environment is discretized so that: (i) travel between adjacent grid elements is possible without leaving their union, (ii) agents can traverse the aforementioned paths at maximum speed, and (iii) edge weights accurately upper bound travel between adjacent regions. Typically, these conditions can be met by choosing discretization cells that are sufficiently large. For example, under a Dubins vehicle model, choosing square cells whose edge lengths are at least twice the minimum turning radius of any of the vehicles is sufficient. If these conditions are not met, Theorems 1, 2, 3, and 4 still apply, though Theorem 5 is no longer guaranteed since more sophisticated logic would be needed to ensure that agent trajectories remain within the allowable regions. The development of an algorithmic extension that would guarantee collision avoidance for general nonholonomic vehicles is not straightforward and is left as a topic of future research.
In practice, however, we note that even if Theorem 5 is not satisfied in a strict sense, implementation of the algorithms herein within a decomposition-based scheme will usually still provide a significantly reduced collision or redundant sensing risk, provided that vehicles remain close to the allowable regions specified by the updates.
Numerical Examples
This section presents numerical examples to illustrate the proposed framework's utility. In all the examples, updates are performed on the cloud via Algorithm 1 during each agent–cloud exchange, and each agent's local processor runs the motion protocol in Algorithm 3. For incremental trajectory construction (Algorithm 3—line 1), all the examples use a modified spectral multiscale coverage (SMC) scheme [10], which creates trajectories to mimic ergodic dynamics while also locally constraining motion to lie outside of prohibited regions. Note this planner satisfies Assumption 2. Initial region generators were selected randomly (enforcing noncoincidence), and each agent was initially placed at its region generator. The initial covering P was created by calculating a weighted Voronoi partition, and the remaining initial parameters were chosen to satisfy Assumption 1. It is assumed that relevant initial variables are uploaded to the agents' local processors prior to initial deployment, i.e., each agent has full knowledge of relevant initial information at time 0. For each simulation, randomly chosen agents sporadically exchanged data with the cloud. Agent–cloud exchange times were randomly chosen, subject to a fixed maximum interexchange time.
Time-Invariant Likelihood.
Consider a four-agent mission, executed over a 100 × 100 surveillance region that is subject to a time-invariant, Gaussian likelihood centered near the bottom left corner. The region is divided into 400 subregions, each 5 × 5. Regions are considered adjacent if they share a horizontal or vertical edge. Here, each agent had a maximum speed of one unit distance per unit time, and ΔH was set to a fixed number of time units. Figure 3 shows the evolution of the coverage regions for an example simulation run. Agent trajectories are shown by the solid lines. Note that Fig. 3 only shows each agent i's active coverage region, i.e., supp(Φi^A(⋅, t)). The family of active coverage regions does not generally form an m-covering of Q; however, elements of this family are connected and never intersect as a result of inherent collision avoidance properties.
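The example's discretization can be reproduced as follows, assuming 4-adjacency and taking the 5-unit cell pitch as the edge weight (an illustrative stand-in for the paper's upper bound on inter-cell travel distance).

```python
def grid_graph(nx=20, ny=20, pitch=5.0):
    """20 x 20 grid of 5 x 5 cells covering the 100 x 100 region."""
    Q = [(x, y) for x in range(nx) for y in range(ny)]
    adj = {k: [] for k in Q}
    for (x, y) in Q:
        for n in ((x + 1, y), (x, y + 1)):  # 4-adjacency, added symmetrically
            if n in adj:
                adj[(x, y)].append((n, pitch))
                adj[n].append(((x, y), pitch))
    return Q, adj

Q, adj = grid_graph()
print(len(Q), sum(len(v) for v in adj.values()) // 2)  # 400 vertices, 760 edges
```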
The left plot in Fig. 4 depicts the maximum amount of time that any grid square went uncovered, i.e., did not belong to any agent's active covering, during each of 50 simulation runs. Here, the maximum amount of time that any region went uncovered was 186 units, though the maximum for most trials was less than 75 units. This is well below the loose bound predicted by Theorem 3 (see Remark 4). Note that this metric does not capture the time between an agent's actual visits to a grid square, only the length of intervals on which no agent was allowed to visit the square. The time between visits is governed by the particular choice of trajectory planner and the parameter ΔH.
The right plot in Fig. 4 shows the mean values of the cost function H, calculated over the same 50 simulation runs. Here, error bars represent the range of cost values at select points. The variance between runs is due to the stochastic nature of the agent–cloud exchange patterns. Note the cost is nonincreasing over time, eventually settling as the coverage regions/generators reach their limiting configuration, e.g., see Fig. 3. These configurations are each Pareto optimal and form a multiplicatively weighted Voronoi partition (Remark 5). The resultant coverage assignments provide load-balancing that takes into account the event likelihood. If the low-level trajectory planner biases trajectories according to the event likelihood, this results in desirable coverage properties. Under the modified SMC planner used here, the temporal distribution of agent locations closely resembles the spatial likelihood distribution in the limit, as shown in Fig. 5.
Further, during the simulation, no two agents ever occupied the same space, due to the careful parameter manipulations employed by Algorithm 1. Figure 6 illustrates the logic governing these manipulations through a simplistic example: During the first update, the left agent acquires some of the right agent's coverage region. Rather than immediately adding these regions to its active covering, the left agent waits until sufficient time has passed to guarantee that the right agent has updated and moved out of the reassigned regions. Under Algorithm 3, once the right agent communicates with the cloud, it immediately vacates the reassigned regions, after which the left agent can add them to its active covering. This procedure guarantees that no two agents will ever have overlapping active coverings and thus never collide (Theorem 5). The same logic provides inherent collision prevention in more complex scenarios.
We can also compare the coverage regions produced by Algorithm 1 to those produced by the partitioning algorithm in Ref. [12]. The two algorithms were simulated in parallel, performing updates with the same randomly chosen agent–cloud exchange orderings across the two conditions. The left and the right plots in Fig. 7 show the mean coverage cost over 50 simulation runs, calculated using the cost function of Ref. [12] (Sec. II-C) and H (Sec. 4.2), respectively (portions of the curves extending above the axes indicate an infinite value). The former is defined nearly identically to H, but uses global graph, rather than subgraph, distances. The evolution produced by the algorithm in Ref. [12] converges to a final configuration slightly faster than that produced by Algorithm 1 when costs are quantified using the former. However, when costs are calculated using H, the algorithm in Ref. [12] produced intermediate configurations with infinite cost, indicating disconnected regions, while Algorithm 1 maintained connectivity. In contrast to Ref. [12], our surveillance framework allows for complete coverage without requiring the agents to leave their assigned regions, allowing it to operate more effectively within a multi-agent surveillance scheme.
Time-Varying Likelihood.
We now illustrate how the proposed coverage framework reacts to changes in the underlying likelihood. Specifically, we study a particular type of time-varying likelihood in which the spatial distribution only changes at discrete time-points, i.e., Φ(k, ⋅) is piecewise constant for any k ∈ Q. This type of scenario is common in realistic missions, e.g., when the cloud's estimate of the global likelihood is only reformulated if some agent's sensor data indicate a drastic change in the underlying landscape. For this purpose, we adopt identical parameters as in the first example, with the exception of the likelihood Φ, whose spatial distribution abruptly switches at select time-points. If the switches are sufficiently spaced in comparison to the rate of convergence, then the coverage regions dynamically adjust to an optimal configuration that is reflective of the current state. For example, Fig. 8 shows the coverage region evolution after the underlying likelihood undergoes a single switch between the likelihoods in Fig. 9 at time t = 2000. In contrast, if the likelihood changes faster than the rate of convergence, coverage regions are constantly in a transient state. Despite this, the proposed framework still provides some degree of load-balancing. To illustrate, the left plot in Fig. 10 shows the value of H during a simulation in which the underlying likelihood switches at 12 randomly chosen time-points over a 1000 unit horizon. Each switch redefined the spatial likelihood as a Gaussian distribution centered at a randomly selected location. Note that the cost monotonically decreases between the abrupt spikes caused by changes in the underlying likelihood. A convergent state is never reached; however, coverage regions quickly shift away from high-cost configurations, as seen in the right plot in Fig. 10, which shows the average percentage drop in the value of the cost as a function of the number of nontrivial updates, i.e., updates that did not execute Algorithm 1—line 3, following an abrupt switch in the likelihood. The percentage drop is calculated with respect to the cost immediately following the most recent switch. During the first nontrivial update, the cost drops, on average, by 21.8% of the initial postswitch value, indicating a quick shift away from high-cost configurations.
Conclusion
This work develops a cloud-supported, decomposition-based, coverage control framework for multi-agent surveillance. In particular, a dynamic partitioning strategy balances the surveillance load across available agents, requiring only sporadic and unplanned agent–cloud exchanges. The partitioning update algorithm also manages high-level logic parameters to guarantee that the resulting coverage assignments have geometric and temporal properties that are amenable to combination with generic single-vehicle trajectory planners. In certain cases, the proposed algorithms produce a Pareto optimal configuration, while ensuring collision avoidance throughout.
Future work should further relax communication assumptions to reflect additional limitations, e.g., use of directional antennae for wireless transmission. Extensions to the proposed algorithms to incorporate explicit area constraints on coverage regions, as well as more general vehicle dynamics, should also be explored. Other areas of future research include the combination of peer-to-peer and cloud-based communication, performance comparisons between specific trajectory planners when used within our framework, e.g., those involving ergodic Markov chains, and further theoretical performance characterization.
Acknowledgment
This work has been sponsored by the U.S. Army Research Office and the Regents of the University of California, through Contract No. W911NF-09-D-0001 for the Institute for Collaborative Biotechnologies.
Appendix: Proofs
Proposition 2 (Sets). Suppose Assumption 1 holds, and that, at the time of each exchange occurring prior to a fixed time, required algorithmic constructions are well-posed so that the cloud can perform updates via Algorithm 1. Then, for any k∈Q and any time:
- (1)
k ∈ Pi, where i = IDk.
- (2)
k belongs to at most two elements of P.
- (3)
if Ti = 0 for i = IDk, then k∉Pℓ for any ℓ ≠ IDk.
- (4)
if k∈Pj, j ≠ IDk, thenfor.
Proof. Fix . When t = 0, is an m-partition of Q, implying the proposition. Since k is not removed from or added to any Pi with i ≠ IDk until its first reassignment, i.e., when IDk is changed, the proposition holds for all t prior to the first reassignment. Suppose the proposition holds for all t prior to the pth reassignment, which occurs at time t0. Suppose . Algorithm 1 defines . Thus, and remains in these sets until another reassignment. Thus, statement 1 holds for all t prior to the p + 1st reassignment. Now note that, by Algorithm 2, reassignment cannot occur at t0 unless . By inductive assumption, statement 3 of the proposition holds when , implying for any ℓ ≠ j. Upon reassignment, the timers Tj, Ti are modified such that . Since (i) IDk cannot change when Tj > 0, and (ii) agent j exchanges data with the cloud and removes k from Pj prior to time , we deduce that k solely belongs to Pj, Pi until the p + 1st reassignment. Further, for any at which Ti = 0 and the p + 1st reassignment has not yet occurred, k ∈ Pi exclusively (addition to other sets in P without reassignment is impossible). We deduce statements 2 and 3 for any t prior to the p + 1st reassignment. Finally, considering Algorithm 2, it is straightforward to show that implies (Tj = 0 only if the most recent exchange that manipulated elements of involved agent j, after which ). Further, (i) no agent claims vertices from unless Tj = 0, and (ii) no vertex is added to a coverage region without reassignment. As such, for any prior to another update in which some other agent claims vertices from Pj. Extending this logic and noting the bound , we deduce the same result for any t prior to the p + 1st reassignment of k. Noting once again, the proposition follows by induction.
Proof of Theorem 1. It suffices to show that Definition 1 is well-posed (Proposition 1) whenever additive sets are required. We proceed by induction. When t = 0, is a connected m-partition of Q; thus, for any i, . The same holds prior to the first agent–cloud exchange, so the first call to Algorithm 1 is well-posed. Now assume that, for all t prior to the pth call to Algorithm 1, (i) is a connected m-partition of Q, and (ii) if an exchange occurs that requires construction of (assume this also applies to the impending pth exchange), then immediately prior to the exchange. This implies that Proposition 2 holds at any time t prior to the p + 1st exchange. Assume the pth exchange occurs at time t0 and involves agent i. Recall that is an m-partition of Q. To show is connected, first notice . Since either (connected by Definition 1) or (connected by assumption), connectivity of follows. Now consider . If , then is connected. Suppose and is not connected. By Proposition 2, for any . Thus, there exists such that (i) , and (ii) any optimal path in spanning k1 and contains some . Select one such path and vertex k2. Without loss of generality, assume {k1, k2} ∈ E. Definition 1 implies and thus . Since and , Proposition 2 implies , contradicting . Thus, is connected. Invoking Proposition 2—statement 3, the inductive assumption holds for all t prior to the p + 1st exchange, implying well-posedness.
Proof of Theorem 2.
Statement 1: The proof of Theorem 1 implies the statement.
Statement 2: P is an m-covering of Q since is an m-partition of Q, and for any i (Proposition 2—statement 1). The covering P is connected, since (connected by statement 1) immediately following any agent–cloud exchange and is unchanged in between updates.
Statement 3: It suffices to show for any t, i: this would imply ci ≠ cj for any i ≠ j, and ci ∈ Pi (Proposition 2). Since for all i at t = 0, the same holds for any t prior to the first agent–cloud exchange. Suppose for all i (thus ci ≠ cj for any i ≠ j) prior to the pth exchange. If agent i is the pth communicating agent, lines 2 and 9 of Algorithm 1 imply . Since for any j, we have . Thus, , and induction proves the statement.
Statements 4 and 5: Statement 4 follows from Eq. (1), noting that . Statement 5 holds by assumption when t = 0. Let k ∈ Q, and consider times when IDk changes (k is reassigned). Since for any j at t = 0, statement 4 implies that, for any t prior to the first reassignment, k belongs exclusively to . Suppose statement 5 holds for all t prior to the pth reassignment (occurring at time t0), and . Then, and k belongs exclusively to when (Proposition 2). By Algorithms 1 and 2, and . Since is unchanging over an interval of length at least , (1) implies when . Since k is reassigned when t = t0, and . Agent j will communicate with the cloud at some time . Thus, Ti > 0 when t = t1, and k is removed from both Pj and . Thus, for all and before the p + 1st reassignment, k belongs exclusively to .
Proof of Theorem 3. Theorem 2 implies statement 1.
Statement 2: For any i, (i) Ti = 0 when t = 0, and (ii) for any connected . Thus, it is straightforward to show, for any i, t, we have the bound . We show by induction that, for any i, t, the bound also holds: Ti = 0 and when t = 0, so , and the bound holds prior to the first cloud–agent exchange involving any agent, since at any such time. Assume the bound holds prior to the pth exchange (occurring at t = t0). Consider two cases: if agent i is the communicating agent, then ; if not, then and either (i) implying the desired bound, or (ii) and . This logic extends to all t prior to the p + 1st exchange and the desired bound follows by induction.
Using the aforementioned two bounds, we have . Fix t and . Then, , and (+ is with respect to the fixed time t). Further, over the interval , the vertex k is not reassigned, Pi is not augmented, and is unchanged. Therefore, . Setting , we have . Since at time , k is not reassigned during the interval . Thus, over the same time interval.
implying that are constant over the interval .
Since coverage regions are connected and nonempty (Theorem 2), and for any on the interval (t0, t] (Proposition 2), (i) there exists a path of length from k into and every vertex along any such path (except the terminal vertex) lies within , and (ii) over the interval . Since (i) each vertex belongs to at most two coverage regions (Proposition 2), (ii) , and (iii) no agent claims vertices within when , vertices along the path (excluding the terminal vertex) do not belong to Pj with j ≠ IDk over . To complete the proof, Algorithm 2 implies , and thus over .
Proposition 3 (Cost). Suppose Assumption 1 holds and that, during each agent–cloud exchange, the cloud updates relevant global and local coverage variables via Algorithm 1. If Φ(⋅, t1) = Φ(⋅, t2) for all t1, t2, then H(c+, P+, t) ≤ H(c−, P−, t) across any agent–cloud exchange occurring at time t.
Proof. Since Φ is time-invariant, for any t1, t2. When t = 0, , and . The same is true prior to the first agent–cloud exchange. Suppose that, prior to the pth exchange (occurring at t = t0, involving agent i), we have . Recall that, for any j, Pj and coincide immediately following any exchange involving agent j and, if agent j claims vertices from Pi, then Algorithm 2 ensures that agent i exchanges data with the cloud before additional vertices are claimed by other agents. Considering the pth update, this logic, along with Proposition 2, implies that , for all j ≠ i. Noting that , we deduce that any contributes equivalently to and . If , then for any j ≠ i such that , we have (Definition 1), implying k contributes equivalently to and . Now suppose , where . We show that : if a length-minimizing path in between and k is also contained in , the result is trivial. Suppose that every such minimum-length path leaves . By Proposition 2, every must satisfy . Thus, assume without loss of generality that k is adjacent to . Let be a vertex that is adjacent to k and lies along a minimum-length path in spanning and k. Since , we have as constructed during the update, implying and thus . Since and , Proposition 2 implies , contradicting . Thus, , which, by inductive assumption, implies that k contributes equally to the value of both and . We conclude . Since P, , and c are static between updates, the statement follows by induction.
Proof of Theorem 4. The cost is static in between agent–cloud exchanges, as P and c are unchanged. Consider an exchange occurring at time t0 involving agent i. By Proposition 3, we have H(c+, P+, t0) ≤ H(c−, P−, t0). Thus, the cost is nonincreasing in time. Since Covm(Q) is finite, there is some time t0 after which the value of H is unchanging. Consider fixed t > t0 at which some agent i exchanges data with the cloud. Since the value of H is unchanging, Algorithm 1 implies that P and c are unchanged by the update. It follows that c and P converge in finite time. Further, since Pi(ID) ⊆ Pi for any i (Proposition 2), and by the persistence of exchanges imposed by the upper bound on interexchange times, after some finite time, P and P(ID) are concurrent.
To prove Pareto optimality of the limiting configuration, consider t0, such that for all t > t0, (c, P) is unchanging and P is an m-partition of Q. Timers are only reset when P is altered, so assume without loss of generality that Ti = 0 for all i at any t > t0. Suppose agent i exchanges data with the cloud at time t > t0. Algorithm 1 implies that there is no k ∈ Pi such that (if not, the cost is lowered by moving ci). Similarly, for k ∈ Pj with j ≠ i that is adjacent to Pi, we have (if not, there exists , contradicting convergence). As such, for any i, there is no such that , implying statement 2 of Definition 2.
Proof of Theorem 5. By Assumption 2, each agent's local trajectory planner never directs the agent into its prohibited region, so if no agent–cloud exchange occurs that removes the vertex corresponding to the relevant agent's location from its coverage region, then the statement is immediate. Suppose, at some time t0, agent i, whose location is associated with some vertex k ∈ Pi, exchanges data with the cloud and k is removed. Immediately following the exchange, agent i executes lines 5 and 6 of Algorithm 3. Theorem 3 ensures that (i) there exists a path in the preupdate region graph between k and the retained coverage region, (ii) all vertices along the path lie outside the other agents' allowable regions for sufficiently long, and (iii) the relevant supports are unchanged over the same interval. Thus, if agent i immediately moves along the path, it will lie exclusively within the vacated area until it reaches the retained coverage region. It remains to show that the agent does not enter another agent's active covering while traversing the aforementioned path. Without loss of generality, consider the update at time t0 previously described. Since k is reassigned prior to the update, we have Ti = 0 (vertices in Pi are not claimed unless Ti = 0). By Proposition 2, no vertices on the constructed path can belong to another agent's active covering, and the relevant supports are constant over the traversal interval.
Footnotes

2. Each agent is uniquely paired with a coverage region, so the quantity m represents both the number of agents and the number of regions.

3. Mathematically, the bound also prevents Zeno behavior.

4. Travel between the elements without leaving their union is possible.

5. Φ(⋅, t) is a probability mass function for any t.

6. Φi^A is not normalized and thus may not be a time-varying probability mass function in a strict sense.

7. ΔH represents, loosely, the amount of time an agent must hold a vertex before it can be reassigned. Precise characterization is in Sec. 4.3.