Fast and Adaptive Multi-Agent Planning under Collaborative Temporal Logic Tasks via Poset Products

Fast and Adaptive Multi-Agent Planning under Collaborative Temporal Logic Tasks via Poset Products

PDF

Zesen Liu¹, Meng Guo¹, Weimin Bao², Zhongkui Li¹^,^*

Research. Vol 7 Article ID 0337

Less

Research. Vol 7 Article ID 0337

• Research Article •

Fast and Adaptive Multi-Agent Planning under Collaborative Temporal Logic Tasks via Poset Products

Full

Zesen Liu¹, Meng Guo¹, Weimin Bao², Zhongkui Li¹^,^*

Affiliations

¹Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing 100871, China.

² Science and Technology Commission of China Aerospace Science and Technology Corporation, Beijing 100048, China.

Published: 2024-03-20 doi: 10.34133/research.0337

Outline

Abstract

Less

Efficient coordination and planning is essential for large-scale multi-agent systems that collaborate in a shared dynamic environment. Heuristic search methods or learning-based approaches often lack the guarantee on correctness and performance. Moreover, when the collaborative tasks contain both spatial and temporal requirements, e.g., as linear temporal logic (LTL) formulas, formal methods provide a verifiable framework for task planning. However, since the planning complexity grows exponentially with the number of agents and the length of the task formula, existing studies are mostly limited to small artificial cases. To address this issue, a new planning paradigm is proposed in this work for system-wide temporal task formulas that are released online and continually. It avoids two common bottlenecks in the traditional methods, i.e., (a) the direct translation of the complete task formula to the associated Büchi automaton and (b) the synchronized product between the Büchi automaton and the transition models of all agents. Instead, an adaptive planning algorithm is proposed, which computes the product of relaxed partially ordered sets (R-posets) on-the-fly and assigns these subtasks to the agents subject to the ordering constraints. It is shown that the first valid plan can be derived with a polynomial time and memory complexity with respect to the system size and the formula length. Our method can take into account task formulas with a length of more than 400 and a fleet with more than 400 agents, while most existing methods fail at the formula length of 25 within a reasonable duration. The proposed method is validated on large fleets of service robots in both simulation and hardware experiments.

Cite this Article

Zesen Liu, Meng Guo, Weimin Bao, Zhongkui Li. Fast and Adaptive Multi-Agent Planning under Collaborative Temporal Logic Tasks via Poset Products[J]. Research, 2024 , 7 (3) : 0337 . DOI: 10.34133/research.0337

Full Text

Less

Introduction

Less

Recent advances in computation, perception, and communication allow the deployment of autonomous robots in large, remote, and hazardous environments, e.g., to assist service staff in hospitals [1], to maintain offshore drilling platforms [2], and to monitor and assist construction sites [3]. Furthermore, fleets of heterogeneous robots, such as unmanned ground vehicles and unmanned aerial vehicles, are deployed to accomplish tasks that are otherwise too inefficient or even infeasible for a single robot [4]. Not only the overall efficiency of the team can be largely improved by allowing the robots to move and act concurrently [5] but also the capabilities of the team can be greatly extended by enabling multiple robots to directly collaborate on a task [6]. Recent works have demonstrated such potentials preliminary for simple tasks such as collaborative exploration [7], formation control [8], object transportation [9], and pursuer–evader games [10]. The task planning problem for multi-robot systems is in general NP-hard [11], due to the inherent combinatorial nature of robot-task assignment and various constraints such as capabilities and deadlines. The standard approach is to formulate a mixed integer linear programs (MILPs) over the integer variables and constraints [12]. While being sound and optimal, these methods are applicable to only small-scale systems. Thus, extensive work can be found on designing meta-heuristic algorithms for finding sufficiently good solutions in a reasonable time, e.g., genetic algorithms [13], colony optimization [14,15], particle swarm optimization [16,17], learning-based algorithms [18–20], or large language models [21,22]. However, these methods often lack a formal guarantee on the correctness and quality of the planning results.

Moreover, to specify more complex tasks, many recent works propose to use formal languages such as linear temporal logic (LTL) [23], computation tree logic (CTL) [24], and signal temporal logic (STL) [25] as an intuitive yet powerful way to describe both spatial and temporal requirements on global [26] or local [27] behaviors. Notably, the works in [28–30] formulate MILP by a central planning unit given different system models and task constraints; the works in [31–35] instead propose various search algorithms over the state or solution space of the whole system. However, the aforementioned planning methods are often executed offline for a set of predefined static tasks. A particularly challenging scenario is when the system operates indefinitely, i.e., new tasks are released or canceled dynamically and continually by external demand [36], or certain target features related to the tasks can change location during run time [10]. This would require the fleet to adaptively change their task plans online to modify existing assignments and incorporate new tasks. Thus, the aforementioned methods become inadequate as the sequence of tasks is infinite and their specifications are unknown beforehand. Recursive application of the centralized methods in a naive way leads to not only intractable computation complexity but also inconsistent or even oscillatory assignments. Thus, an efficient and adaptive planning scheme is essential for multi-robot systems that collaborate in a dynamic environment [37–39] or an unknown environment [40].

Related work

The standard framework for planning under temporal tasks is based on the model-checking algorithm [23]: First, the task formulas are converted to a deterministic Robin automaton (DRA) or nondeterministic Büchi automaton (NBA) via off-the-shelf tools such as SPIN [41] and LTL2BA [42]. Second, a product automaton is created between the automaton of formula and the models of all agents, such as weighted finite transition systems (wFTSs) [23], Markov decision processes (MDPs) [43], or petri nets [44]. Last, certain graph search or optimization procedures are applied to find a valid and accepting plan within the product automaton, such as nested-Dijkstra search [32], integer programs [45], auction [46], or sampling-based [33,47].

Thus, the fundamental step of all aforementioned methods is to translate the task formula into the associated automaton. This translation may lead a double-exponential size w.r.t. the formula length as shown in [42]. The only exceptions are GR(1) formulas [48], of which the associated automaton can be derived in polynomial time but only for limited cases. In fact, for general LTL formulas with length more than 25, it takes more than 2 h and 13 GB memory to compute the associated NBA via LTL2BA. Although recent methods have greatly reduced the planning complexity in other aspects, the length of considered task specifications remains limited due to this translation process. For instance, the sampling-based method [33,47] avoids creating the complete product automaton via rapidly exploring random tree (RRT) sampling, of which the largest simulated case has 400 agents and the task formula has a maximum length of 14. The planning method [32] decomposes the resulting task automaton into independent subtasks for parallel execution. The simulated case scales up to 100 robots and a task formula of maximum length 18. Moreover, other existing works such as [49–52] mostly consider task formulas of length around 6 to 10. This limitation hinders its application to more complex robotic missions.

This drawback becomes even more apparent for dynamic scenes, where contingent tasks specified as LTL formulas are triggered by external observations and released to the whole team online. In such cases, most existing approaches compute the automaton associated with the new task, of which the synchronized product with the current automaton is derived, see, e.g., [51,52]. Thus, the size of the resulting automaton equals to the product of all automata associated with the contingent tasks, which is clearly a combinatorial blow-up. Consequently, the amount of contingent tasks that can be handled by the aforementioned approaches is limited to handpicked examples.

Our contribution

To overcome this curse of dimensionality in the size of tasks and agents, we propose a new paradigm that is fundamentally different from the model checking-based methods. First, for a syntactically co-safe LTL (sc-LTL) formula that is a conjunction of numerous subformulas, we calculate the R-posets of each subformula as a set of subtasks and their partial temporal constraints. Then, an efficient algorithm is proposed to compute the product of R-posets associated with each subformula. The resulting product of R-posets is complete in the sense that it retains all subtasks from each R-poset along with their partial orderings and resolves potential conflicts. Given this product, a task assignment algorithm called the time-bound contract net (TBCN) is proposed to assign collaborative subtasks to the agents, subject to the partial ordering constraints. Last but not least, the same algorithm is applied online to dynamic scenes where contingent tasks are triggered and released by online observations. It is shown formally that the proposed method has a polynomial time and memory complexity to derive the first valid plan w.r.t. the system size and formula length. Extensive large-scale simulations and experiments are conducted for a hospital environment where service robots react to online requests such as collaborative transportation, cleaning, and monitoring.

The main contribution of this work is threefold: (a) a systematic method is proposed to tackle task formulas with length more than 400, which overcomes the limitation of existing translation tools that can only process formulas of length less than 25 in reasonable time; (b) an efficient algorithm is proposed to decompose and integrate contingent tasks that are released online, which not only avoids a complete re-computation of the task automaton but also ensures a polynomial complexity to derive the first valid plan; (c) the proposed task assignment algorithm is fully compatible with both static and dynamic scenarios with interactive objects.

Problem statement

Consider a multi-agent system with heterogeneous capabilities, a series of sc-LTL task formulas that are released online, and a set of interactive objects in the dynamic environment. The objective is to generate a task plan for the system online such that these tasks are satisfied with high efficiency.

For instance, as shown in the “Numerical simulation” section, a fleet of heterogeneous service robots is deployed in the hospital environment. Different tasks such as transportation of goods or patients, cleaning, and maintenance are released online continuously. Many of such tasks contain numerous subtasks with ordering constraints that require direct collaboration of different robots. An efficient coordination algorithm is proposed such that these subtasks are assigned and fulfilled online in a timely manner.

Results

Less

In this section, we present the proposed solution briefly, where we first give a definition of task specification and introduce the core method of computing the R-poset product. Then, an efficient assignment algorithm is introduced under these posets with temporal and spatial constraints. The simulation and hardware experiment results are presented against several strong baselines for different sizes of fleets and task complexities. Technical details and derivations can be found in Materials and Methods.

Task specification and R-poset product

We briefly introduce the syntax of LTL used for task specifications. The basic ingredients of LTL formulas are a set of atomic propositions AP in addition to several Boolean and temporal operators. Atomic propositions are Boolean variables that can be either true or false. The syntax of LTL is defined as φ ≜ ⊤ ∣ p ∣ φ₁ ∨ φ₂ ∣ φ₁ ∧ φ₂ ∣ ¬ φ ∣ ◯ φ ∣ φ₁ Uφ₂ ∣ ◻ φ ∣ ⋄ φ, where ⊤ ≜ True; p ∈ AP is the alphabet; ∨ (conjunction), ∧ (disjunction), and ¬ (negation) are the logical operators; ⋄ (eventually), ◯ (next), and U (until) are the temporal operators; and ◻ (always) and ⇒ (implication) are the derived operators. A special class of LTL formula called sc-LTL [23,53] only contains the temporal operators ◯, U, and ⋄. A complete description of the semantics and syntax of LTL can be found in [23].

Given a series of sc-LTL formulas that are released online, most existing methods would combine the formulas with ∧ (conjunction) and convert them into an NBA. It results in a graph structure that consists of states, transitions, guards, inital states, and final states. However, since its size is double exponential to the length of formula φ as proven in [42], it quickly becomes intractable with more task formulas. Instead, we compute the product of R-posets associated with these formulas, called the Poset-prod (denoted by ⊗). The R-poset P = (Ω, ⪯ , ≠) is a high-level abstraction of NBA, proposed in our earlier work [54], which consists of a set of subtasks Ω and their partial relations as less equal ⪯ and conflict ≠. It has been proven therein that if the subtasks are executed under the partial relations, the resulting traces satisfy the NBA. Once a new task formula is released, it is transformed into a new R-poset P₂ and its product with the current R-poset P₁ is computed. More specifically, given P₁, P₂, the Poset-prod returns a new set of R-posets

P = P 1 ⊗ P 2

that satisfies both P₁ and P₂, which is computed by iterating the following two procedures: Task composition and Relation update. The first procedure is to create a group of subtasks as a combination between the subtasks in P₂ and the subtasks of P₁ that have not been executed. A depth-first-search (DFS) algorithm is proposed to gradually add the subtasks along with the corresponding mapping function. The second procedure is to calculate the partial relations between the subtasks such that the ordering constraints among the subtasks in the composed product are consistent without conflicts. Consequently, the overall poset is given by P_final = (Ω_f , ⪯_f, ≠_f), which is iteratively computed each time a new poset is added.

The proposed Poset-prod method outperforms the traditional method in both computational efficiency and performance. This improvement becomes even more pronounced when the number and length of subformulas increase. This is because the time of generating NBAs of φ₁, φ₂, ⋯ grows linearly, while the time of converting φ₁ ∧ φ₂ ∧ ⋯ into NBA grows exponentially. Moreover, for a new added formula, the algorithm updates the final R-poset based on the previous result, which ensures the performance for online cases. Finally, it is an anytime algorithm that the algorithm can calculate an R-poset within linear complexity for multiple subformulas at the expense of optimality. The concrete complexity analysis of Poset-pord is shown in Discussion, and the definition of R-poset and label is shown in Materials and Methods.

To give an example, consider four subformulas φ₁, φ₂, φ₃, φ₄ as shown in Fig. 1. Their NBAs

B 1, ⋯, B 4

can be transformed into four R-posets P₁, ⋯, P₄. The first round of Poset-prod is between P₁ and P₂ as P₁ ⊗ P₂ = {P_f1, P_f2}. Then, the following rounds of Poset-prod are performed between the results of the previous round and the next R-poset as P_f1 ⊗ P₃, of which the first solution is denoted by P_f3. At the expense of some optimality, the algorithm can go to next round before all results of P_f2 ⊗ P₃ are generated, as it is an anytime algorithm. Finally, we can derive P_f4 by computing P_f3 ⊗ P₄ as the final R-poset, which satisfies P₁, ⋯, P₄. Note that the time to generate the NBA associated with φ₁,

∧ i = 1 2 φ i

∧ i = 1 3 φ i

, and

∧ i = 1 4 φ i

grows significantly with a duration of 0.058 s, 0.076 s, 1.56 s, and 25.56 s via LTL2BA [42]. By contrast, the time to compute these products P₁ ⊗ P₂, P₁ ⊗ P₂ ⊗ P₃, P₁ ⊗ P₂ ⊗ P₃ ⊗ P₄ grows slower with a duration of 0.199 s, 0.279 s, and 0.385 s. Thus, our method can deal with a much larger number of tasks online. Detailed comparisons between our method and traditional methods can be found in the “Scalability analysis and comparisons” section.

Online subtask assignment under complexity constraints

Once the set of R-posets that satisfies all subformulas is generated, a series of subtasks in the R-poset should be assigned to the agents under several complexity constraints, including temporal constraints from R-poset, objects constraints, and cooperative constraints. Similar to the classical contract net method [55], we propose an efficient and suboptimal assignment algorithm called time bound contract net (TBCN). The main difference is that all these constraints are represented uniformly by time bounds in our methods, and an example is shown in Fig. 2. The final assignment is a group of timed sequence of robot actions

J = J 1, ⋯, J n

, where J_n = [(t_k, ω_k, a_k), ⋯] indicates that agent n will execute action a_k at time t_k to satisfy subtask ω_k such that the newly released tasks are fulfilled.

TBCN consists of three steps: initialization, computation of feasible subtasks, and online bidding. As the partial constraints of final R-poset P_final might be changed after computing the product, the subtasks that are conflicting with the updated partial orders are removed in initialization. Then, the last two steps are iterated: the set of subtasks whose partial orders are satisfied given the current assignment

J

in computation of feasible subtasks. Consequently, a linear program is formulated and solved in the online bidding, to choose one subtask from the feasible set. Thus, the resulting execution time and action plans are added to

J

Numerical simulation

As shown in Fig. 4, the hospital environment consists of the wards, the operating rooms, the hall, the exits, and the hallways. The multi-agent system is employed with three junior doctors, six senior doctors, and eight nurses, and four types of patients are treated as interactive objects including family visitors, vomiting patients, senior patients, and junior patients. The detailed mappings between agents, action, objects, and their labels are shown in Table S1. Furthermore, various types of tasks are considered, such as “Go the rounds of the wards and provide medicine,” “Check and record the patient status,” and “Prepare and perform an operation on a patient.” Some tasks are released online under certain conditions, for instance, when a patient vomits at a region, the task “Take the patient into ward and check his status. Meanwhile, the doctors should not enter this region until it has been cleaned.” The sc-LTL formula associated with the complete task is given by

φ = φ b 1 ∧ ⋯ φ b 6 ∧ φ VP ∧

φ JP ∧ φ SP 1 ∧ φ SP 2 ∧ φ FV

, which has a total length of 62. Detailed description of formulas are shown in Table S2.

As shown in Fig. 3, each subtask ω_i consists of the constraints before execution (in blue) and during execution (in green). The directed black arrows denote the ordering constraints, while the bidirectional red arrows for the conflicting constraints. The mapping from the subtasks within the input R-posets to the subtasks within the final R-poset is shown in brackets as

ω 1 ′ (ω 1 ′) : ω 1 ′

→ ω₁. Note that most of R-posets on the right are directly incorporated into the final R-posets on the left, such as

P b 1, ⋯, P b 6, P u 1, ⋯, P u 5

. This is due to the fact that their subtasks are independent without ordering constraints, which allows for parallel execution. Additionally, some subtasks on the left represent the same subtasks on the right. For example,

ω 1 ′

P b 2

and

ω 1 ′

P b 2

representing both ω₅, since their ordering constraints (in green) are identical. The same action can be performed to satisfy multiple subtasks on the left, which improves the overall efficiency. Furthermore, all subtasks on the right satisfy the partial relations between the corresponding subtasks on the left, while additional constraints are added if there are conflicts among the ordering constraints. For instance, action

D w 7, w 7 ∅ ∅

of ω₁₅ should not be executed before the subtask ω₁₄. Thus, an additional ordering constraint is added such that subtask ω₁₅ should be executed after ω₁₄. These properties guarantee that each subtask within the R-posets on the left can be executed, as their partial relations are satisfied. Consequently, the final R-poset satisfies all R-posets on the left. Note that the complete formula has a total length of 62, of which the NBA takes more than 1 h to compute. On the contrary, the first final R-poset is computed within 10.96 s.

The results of task assignment are shown in Fig. 4, which include the updates at 40, 180, 215, and 400 s with five additional objects added. Then, after modifying the current R-posets to accommodate these formulas, the method TBCN is executed to assign the new subtasks within the final R-poset. The trajectories of agents and objects are shown, with certain regions that are not allowed to enter. For instance, when a patient vomits at region h13 in Fig. 4C, the nurses cannot enter until other agents have cleaned this region. These constraints ensure that both their actions and their trajectories satisfy the R-poset. In addition, once a new object is added, a new formula is released and then the R-poset is updated. All subtasks satisfy the ordering constraints in the R-posets. For instances, as shown in Fig. 4D, although agents 6, 14 have arrived in region w7 before 100 s to collaborate on the subtask

σ 16 = M w 7, w 7 ∅ ∅

(in blue), they have to wait until that agent 10 has fulfilled the subtask

σ 14 = C w 7, w 7 ∅ ∅

, due to the ordering constraints that (ω₁₄, ω₁₆) ∈ ⪯ , {ω₁₄, ω₁₆} ∈ ≠. The constraints introduced by objects are also satisfied, e.g., task ω₇ (in green) cannot be executed at 380 s before subtask ω₆ (in purple) has been finished, since the required object 3 has not been transferred to region o₄. Last but not least, most tasks are executed in parallel mostly with a total completion time of 815 s, which is much shorter than 2013.5 s if the subtasks are executed sequentially.

Scalability analysis and comparisons

First, we show how the computation time of the proposed task assignment method TBCN varies under different numbers of agents and subtasks. Since the number of subtasks cannot be directly determined, we run the TBCN with a large number of formulas, of which the length ranges from 20 to 80. The number of subtasks and the associated computation time are recorded. As shown in Fig. 5, the average computation time only increases slightly as the number of agents is increased from 12 to 400, while the computation time increases considerably if the number of subtasks is increased from 22 to 34. This is due to the fact that new subtasks would introduce additional temporal constraints in the assignment procedure. The execution efficiency η from Eq. 6 decreases as the number of agent increases, while η increases slightly as the number of subtasks grows. The highest η is about 39% where most subtasks are executed in parallel with only minimum waiting time for task synchronization.

Second, to further validate the scalability of the proposed method against existing methods, we evaluate the time and memory cost to compute both the first R-poset and the complete R-posets by the proposed Poset-prod, given the task formulas of different lengths. The conversion from an LTL formula to NBA is via LTL2BA. The following three baselines are considered: (a) the direct translation from the complete formula to NBA [42], in which the complete formula is the conjunction of all subformulas; (b) the decomposition set-based algorithm [56], which decomposes the NBA into independent subtasks; (c) the sampling-based method [33,47], which generates a product automaton much smaller than the complete one by sampling the product states of NBA and weighted transition systems (WTS) via RRT. Each method is tested three times with five formulas of the same length ranging from 5 to 400. As shown in Fig. 6, both the time and memory cost increase drastically with the formula length for all methods except the proposed Poset-prod to compute the first R-poset. In particular, when the formula length exceeds 15, the decomposition algorithm runs out of memory or time. The sampling-based method cannot generate a solution when the formula length exceeds 20. Since all the baseline methods require the translation to NBA first, it becomes intractable as the formula length exceeds 25. In contrast, the proposed method of Poset-prod can generate all R-posets with a formula length of 70 and the first R-poset even when the length reaches 400, which is consistent with our analyses in Discussion.

Hardware experiment

We further tested the proposed method on hardware within a simplified hospital environment. The multi-agent system consists of 10 differential-driven mobile robots, with 3 JD in green, 3 SD in yellow, and 4 Nu in blue. The required tasks include “prepare and execute an advanced operation for patient 1 at operating room o4.” Similar to the “Numerical simulation” section, there are event-triggered tasks released online such as “when a patient vomits, take him to the ward, do not enter this region until it is cleaned.” The exact task description and formulas are shown in Table S3.

In summary, the system is required with six subformulas, and the final formula is φ = φ_b1 ∧ ⋯ ∧

φ b 4

φ_FV ∧ φ_JP, with the total length 27; thus, it cannot be translated into an NBA directly. The agent trajectories within 185 s are shown in Fig. 7. As shown in the right of Fig. 7, the final R-poset consists of 14 subtasks, where the part in red is calculated offline and the part in yellow is generated online. Most subtasks are executed in parallel, meaning that the R-posets can be satisfied with a high efficiency. The complete R-poset product is derived in 3.1 s, and the task assignment method TBCN is activated three times, of which the average planing time is 2.7 s. As shown in the left of Fig. 8, the Gantt graph is updated twice at 5 and 100 s during execution, and subtasks that are released online are marked by green boxes. It is worth noting that the agent movement during real execution requires more time due to motion uncertainty, communication delay, drifting, and collision avoidance. Nonetheless, the proposed method can adapt to these fluctuations and still satisfy the specified tasks, as shown in the Gantt graph of the overall execution in Fig. 8. Experiment videos are provided in the Supplementary Materials.

Discussion

Less

In this work, we propose Poset-prod, an efficient online task planning algorithm for multi-agent systems where tasks are released dynamically and constantly online. It consists of a systematic method to convert the temporal tasks into their equivalent R-posets, of which their products are computed online. Given these R-posets, an anytime task assignment algorithm is proposed to adapt the local plans of robots online such that the overall safety and correctness are ensured. The overall framework is shown to be fast and efficient, thus particularly suitable for large-scale multi-agent systems collaborating in dynamic environments.

The most significant advantage of Poset-prod is that the complexity of obtaining the first solution only grows linearly with the length of formulas. Given a set of formulas {φ_i, i ≤ m} and max_i ∣ φ_i ∣ = n, deriving the first solution has a polynomial complexity of O(m²n³). Despite that the overall complexity to compute the complete R-posets of {φ_i} is O(n^5m), it is still much smaller than the complexity of calculating the NBA of the conjunction φ via LTL2BA [42], which is O(mn · 2^mn). Thus, our method can plan for task formulas with a length of about 400 within about 50 s, while most existing methods fail at the formula length of 25.

Future work involves two directions: (a) extending the sc-LTL task formulas to general LTL and other languages such as CTL [24] and STL [25]. It remains unclear how general LTL formulas with always operators can be incorporated in the R-poset, especially with the prefix–suffix structure, and (b) considering unknown and uncertain environments that are modeled as MDPs. In this case, a reactive high-level plan is essential to take into account all possible environment behaviors.

Materials and Methods

Less

In this section, we provide the knowledge of LTL in the “Preliminaries” section, the definition of alphabets and objective function in the “Problem formulation” section, and algorithm details in the “Approach” section.

Preliminaries

As mentioned in the “Task specification and R-poset product” section of Results, the basic ingredients of LTL formulas are a set of atomic propositions AP in addition to several Boolean and temporal operators. For a given LTL formula φ, there exists an NBA as follows:

Definition 1 An NBA

A ≜ S, Σ, δ, S 0, S F

is a four-tuple, where S are the states; Σ = AP; δ : S × Σ → 2^S are transition relations; S₀, S_F ⊆ S are initial and accepting states.

An infinite word w over the alphabet 2^AP is defined as an infinite sequence W = σ₁σ₂⋯, σ_i ∈ 2^AP. The language of φ is defined as the set of words that satisfy φ, namely,

L φ = Words φ = W | W ⊨ φ

and ⊨ is the satisfaction relation. Additionally, the resulting run of w within

A

is an infinite sequence ρ = s₀s₁s₂⋯ such that s₀ ∈ S₀, and s_i ∈ S, s_i+1 ∈ δ(s_i, σ_i) hold for all index i ≥ 0. A run is called accepting if it holds that inf(ρ) ∩ S_F ≠ ∅, where inf(ρ) is the set of states that appear in ρ infinitely often. A special class of LTL formula is called sc-LTL [23,53], which can be satisfied by a set of finite sequence of words. They only contain the temporal operators ◯, U, and ⋄ and are written in positive normal form where the negation operator ¬ is not allowed before the temporal operators. A relaxed partially ordered set (R-poset) over an NBA

B φ

is defined as follows:

Definition 2 (R-poset) [54] R-poset is a three-tuple: P_φ = (Ω_φ, ⪯_φ, ≠_φ):

Ω φ = ℓ, σ ℓ, σ ℓ s, ∀ ℓ = 0, …, L

is the set of subtasks, where ℓ is the index of subtask ω_ℓ; σ_ℓ ⊆ Σ are the transition labels;

σ ℓ s ⊆ Σ

are the self-loop labels from Definition 1.

⪯_φ ⊆ Ω_φ × Ω_φ is the “less equal” relation: If (ω_h, ω_ℓ) ∈ ⪯_φ or equivalently ω_h⪯_φω_ℓ, then ω_ℓ can only be started after ω_h is started. ≠_φ ⊆ 2^Ω_φ is the “opposed” relation: If {ω_h, ⋯, ω_ℓ} ∈ ≠_φ or equivalently ω_h≠_φ⋯≠_φω_ℓ, subtasks in ω_h, ⋯, ω_ℓ cannot all be executed simultaneously.

A word is accepting if it satisfies all the constraints imposed by P_φ. Additionally, the word of R-poset will satisfy the NBA as Words(P_φ) ⊂ Words(φ).

Definition 3 (Language of R-poset) [54] Given a word w = σ′₁σ′₂⋯ satisfying R-poset P_φ = (Ω_φ, ⪯_φ, ≠_φ), denoted as w ⊨ P_φ, it holds that (a) given

ω i 1 = i i, σ i 1, σ i 1 s ∈ Ω φ

, there exist j₁ with

σ i 1 ⊆ σ j 1 ′

and

σ j 1 s ∩ σ m 1 ′ = ∅, ∀ m 1 < j 1

; (b)

∀ ω i 1, ω i 2 ∈

⪯ φ, ∃ j 1 ≤ j 2, σ i 2 ⊆ σ ′ j 2, ∀ m 2 < j 2, σ j 2 s ⊆ σ ′ m 2 = ∅

; (c)

∀ ω i 1, …, ω i n

∈ ≠ φ, ∃ ℓ ≤ n, σ i ℓ ⊆ σ ′ j 1

. Language of R-poset P_φ is the set of all word w that satisfies P_φ, denoted by

L P φ ≜ w | w ⊨ P φ

Assuming that

P b i = P 1 b i, P 2 b i, ⋯

is the set of all possible R-posets of NBA

B b i

, it is shown in Lemma 3 of our previous work [54] that (a)

L P j b i ⊆ L B b i, P j b i ∈ P φ b i

and (b)

⋃ P j b i ∈ P φ b i ‍ L P i = L B b i

. In other words, the R-posets contain the necessary information for subsequent steps.

Problem formulation

Collaborative multi-agent systems

Consider a workspace

W ⊂ ℝ 2

with M regions of interest denoted as

W

≜ {W₁, ⋯,W_M} , where W_m ∈ W. Furthermore, there is a group of agents denoted by

N

≜ {1, ⋯, N} with different types L ≜ {1, ⋯, L}. More specifically, each agent

n ∈ N

belongs to only one type l = M_type(n), where

M type : N → L

. Each type of agents l ∈ L is capable of providing a set of different actions denoted by

A l

. The set of all actions is denoted as

A a = ∪ l ∈ L ‍ A l = a 1, ⋯, a n C

. Without loss of generality, the agents can navigate between each region via the same transition graph, i.e.,

G = W, → G, d G

, where

→ ⊆ W × W

represents the allowed transitions and

d G : → G → ℝ +

maps each transition to its duration.

Moreover, there is a set of interactive objects

O

≜ {o₁, ⋯,o_U}with several types

T

≜ {T₁, ⋯,T_H} scattered across the workspace W. These objects are interactive and can be transported by the agents from one region to another. An interactive object

o u ∈ O

is described by a three-tuple:

o u ≜ T h u, t u, W u P,

(1)

where

T h u ∈ T

is the type of object; t_u ∈ R⁺ is the time when o_u appears in workspace W;

W u p : R + → W

is a function that

W u p t

returns the region of o_u at time t ≥ t_u; and

W u p t u ⊆ W

is the initial region. Additionally, new objects appear in W over time and are then added to the set

O

. With a slightly abusive notation, we denote the set of initial objects

O in

that already exist in W and the set of online objects

O on

that are added during execution, i.e.,

O = O in ∪ O on

To interact with the objects, the agents can provide a series of collaborative behaviors

C ≜ C 1, ⋯, C K

. A collaborative behavior

C k ∈ C

is a tuple defined as follows:

C k ≜ o u k, (a i, n i), i ∈ L k

(2)

where

o u k ∈ O ∪ ∅ ∅

is the interactive object if any;

a i ∈ A a

is the set of cooperative actions required; 0 < n_i ≤ N is the number of agents to provide the action a_i; and L_k is the set of action indices associated with the behavior C_k. Also, d_k denotes the execution time of C_k.

A behavior can only be executed if the required object is at the desired region. Since objects can only be transported by the agents, it is essential for the planning process to find the correct order of these transportation behaviors. Related works [50,57] build a transition system to model the interaction between objects and agents, the size of which grows exponentially with the number of agents and objects.

Task specifications

Consider the following two types of atomic propositions: (a)

p m l

is true when any agent of type l ∈ L is in region

W m ∈ W

;

p m r

is true when any object of type

r ∈ T

is in region

W m ∈ W

. Let

p ≜ p m l, ∀ W m ∈ W, l ∈ L ‍ ⋃ ‍ p m r, ∀ W m ∈ W, r ∈ R

. (b)

C k k 1, k 2 u k

is true when the collaborative behavior C_k in Eq. 2 is executed with object o_{u_k}, starting from region W_k₁ and ending in region W_k₂. Let

c ≜ {C k k 1, k 2 u k, ∀ C k ∈ C,

o u k ∈ O, ∀ W k 1, W k 2 ∈ W}

. Given these propositions, the team-wise task specification is specified as an sc-LTL formula over {p, c}:

φ = ∧ i ∈ I φ b i ∧ ∧ o u ∈ O on φ e u,

(3)

where

φ b i, i = 1, …, I, φ e u, o u ∈ O o n

are two sets of sc-LTL formulas over {p, c}. φ_{b_i} is specified in advance, while φ_{e_u} is generated online when a new object o_u is added to

O on

To satisfy the LTL formula φ, the complete action sequence of all agents is defined as

J = J 1, J 2, ⋯, J N,

(4)

where

J n ∈ J

is the sequence of

t k, C k, a k

, which means that agent n executes behavior

C k

by providing the collaborative service

a k ∈ A a

at time t_k. In turn, the sequences of actions for an interactive object is defined as:

J o = J 1 o, J 2 o, ⋯, J U o,

(5)

where

J u o = t k, C k ⋯

is the sequence of tasks associated with object

o u ∈ O

. The task pair

(t k, C k)

is added to

J u o

if object o_u joins behavior

C k

at time t_k. Assuming that the duration of formula φ_{b_i}, φ_{e_i} from being published to being satisfied is given by D_i, the average efficiency is defined as

η ≜ ∑ C k ∈ J ‍ ∣ L k ∣ d k D i,

(6)

which is the percentage of time when actions are performed.

Problem 1 Given the sc-LTL formula φ defined in Eq. 3, synthesize and update the motion and action sequence of agents

J

and objects

J o

for each agent

n ∈ N

to satisfy φ and maximize execution efficient η.

Although maximizing the task efficiency of a multi-agent system is a classical problem, the combination of interactive objects, long formulas, and contingent tasks imposes new challenges in terms of exponential complexity [42,50] and online adaptation [51].

Approach

As shown in Fig. 9, when a new R-poset is generated, the proposed solution realizes the requirement through two main components: (a) product of R-posets, where the product of existing R-posets is computed incrementally, and (b) task assignment, where subtasks are assigned to the agents given the temporal and spatial constraints specified in the R-poset.

Product of R-posets

As the first two steps shown in Fig. 9, when a new formula φ_i ∈ Φ_b, Φ_u is added, it will be transformed into NBA first. Then, an R-poset P_i is generated by the method proposed in our previous work [54]. The other R-poset P_final involved in calculation is the previous result of Poset-prod, which will be updated after this round calculation. With P_final as P₁ = (Ω₁, ⪯₁, ≠₁), P_i as P₂ = (Ω₂, ⪯₂, ≠₂), we define product of R-posets as follows:

Definition 4 (Product of R-posets) Given a finite word w₀, the product of two R-posets P₁, P₂ is defined as a set of R-posets

P r = {P 1 ′, P 2 ′, ⋯}

, denoted as

P r = P 1 ⊗ P 2

where

P i ′

satisfies two conditions: (a) if

w 0 w ∈ L P 1, w ∈ L P 2

, then

w 0 w ∈ ⋃ P ′ i ∈ P r ‍ L P ′

; (b) if

w 0 w ∈ L P ′ i

P ′ i ∈ P r

, then

w 0 w ∈ L P 1, w ∈ L P 2

The word w₀ is already executed, containing the finished subtasks

Ω finish 1

of P₁. Thus, w₀ can not influence P₂ whose subtasks will be executed in the future. Specially, if the new formula is offline as φ_i ∈ Φ_b and the agents have not started to execute subtasks, w₀ will be set as empty. As shown in Fig. 9, Poset-prod consists of the following two steps.

(a) Task composition. In this step, we generate all possible combinations of subtasks that can satisfy both Ω₁ and Ω₂. Namely, a set of subtasks Ω′ = {

ω 1 ′

, ⋯,

ω n ′

} satisfies Ω₁ if for each

ω i 1 = (i, σ i 1, σ i s 1) ∈ Ω 1

, there is a subtask

ω j ′ = j, σ j ′, σ j s ′ ∈ Ω ′

satisfying

σ i 1 ⊆ σ ′ j

and

σ i s 1 ⊆ σ j s ′

, or

ω ′ j ⊨ ω i 1

for brevity. Thus, we set Ω′ = Ω₁ first and Ω′ clearly satisfies Ω₁ with

∀ ω i 1 ∈ Ω 1, ω ′ i ⊨ ω i 1

. Then, a mapping function M_Ω : Ω₂ → Ω′ is defined to store the satisfying relationship from Ω₂ to Ω′, and

D M Ω ∈ Ω 2

is the domain of M_Ω and

R M Ω

is the range of M_Ω. As all relations are unknown, M_Ω,

D M Ω

, and

R M Ω

are initially set to empty. Namely, if

ω ′ j ⊨ ω i 2

, we will store

M Ω ω i 2 = ω ′ j

and

M Ω − 1 ω ′ j = ω i 2

, and add

ω i 2

into

D M Ω

, and

ω j ′

into R(M_Ω), where

M Ω − 1

is the inverse function of M_Ω.

DFS [58] is used to add the subtasks of Ω₂ to Ω′ in order, and these mapping relations will be recorded in M_Ω. The search sequences of DFS can be initialized as que = [(Ω′ = Ω₁, M_Ω = ∅)]. Then, during the circle, we will fetch the first node of que as (Ω′,

M Ω ′

). The next unmixed subtasks in Ω₂ are

ω i 2, i = ∣ M ′ Ω ∣ + 1

. Any subtask

ω j 1 ∈ Ω 1 / R M ′ Ω / Ω finish 1

with

ω j 1 ⊨ ω i 2

ω i 2 ⊨ ω j 1

can create a new combination based on (Ω′,

M Ω ′

Ω ̂ ′ = Ω ′, ω ̂ ′ j = j, σ j 1 ∪ σ i 2, σ j s 1 ∪ σ i s 2, M ̂ ′ Ω = M ′ Ω, M ̂ ′ Ω ω i 2 = ω ′ j,

(7)

which means that the subtask

ω j ′

in Ω′ can satisfy both

ω j 1, ω i 2

. Moreover, for the subtask

ω i 2

, we can always create a set of subtasks

Ω ̂ ′

and the corresponding mapping function

M ̂ ′ Ω

by appending

ω i 2

into

Ω ̂ ′

ω ̂ ′ j

such that

ω ̂ ′ j ⊨ ω i 2

holds, i.e.,

Ω ̂ ′ = Ω ′, j = ∣ Ω ′ ∣ + 1, ω ̂ ′ j = ω i 2; M ̂ ′ Ω = M ′ Ω, M ̂ ′ Ω ω i 2 = ω ′ j,

(8)

which means the subtask

ω j ′

can be executed to satisfy

ω i 2

. This step ends if the time budget t_b or the search sequence que exhausted. Once ∣D(

M Ω ′

) ∣ = ∣ Ω₂∣, a Ω′ satisfying Ω₁, Ω₂ is already found. In this case, the next step is triggered. As shown in Fig. 9, one of found combination is

M Ω 1 1 = 1, M Ω 1 2 = 3, M Ω 1 3 = 4

ω 1 f

is created by (7), and

ω 3 f, ω 4 f

is created by (8).

(b) Relation update. Given the set of subtasks Ω′ and the mapping function M_Ω, we calculate the partial relations among them and build a product R-poset P. First, we construct the “less-equal” constraint ⪯ as follows:

⪯ = ⪯ 1 ∪ M Ω ω ℓ 1 2, M Ω ω ℓ 2 2 | ω ℓ 1 2, ω ℓ 2 2 ∈ ⪯ 2,

(9)

which inherits both less equal relations ⪯₁ in P₁ and ⪯₂ in P₂. Then, we update Ω′ to consider the constraints imposed by the self-loop labels in other subtasks. Specifically, if a new relation (ω_i, ω_j) is added to ⪯ by

(M Ω − 1 (ω i), M Ω − 1 (ω j)) ∈ ⪯ 2

while

(ω i 1, ω j 1) ∈ ⪯ 1

holds, ω_i is required to be executed before ω_j, although

(ω i 1, ω j 1)

does not belong to ⪯₁ of P₁. In this case, σ_i and

σ i s

are updated to guarantee the satisfaction of the self-loop labels

σ j s

before executing σ_j. For each subtask ω_i, the newly added suf-subtasks from ⪯₁, ⪯₂ are defined as

S p 1, S p 2

, i.e.,

S p 1 = ω j | ω i, ω j ∈ ⪯, ω i 1, ω j 1 ∈ ⪯ 1, S p 2 = M Ω − 1 ω j | ω i, ω j ∈ ⪯, M Ω − 1 ω i 2, M Ω − 1 ω j 2 ∈ ⪯ 2,

(10)

where S¹_p, S²_p are the subtasks that should be executed after ω_i. Thus, the action labels σ_i and self-loop labels

σ i s

ω i = (i, σ i, σ i s)

are updated accordingly as follows:

σ ̂ = ⋃ ω ℓ 1 ∈ S p 1 ‍ σ ℓ s 1 ∪ ⋃ ω ℓ 2 ∈ S p 2 ‍ σ ℓ s 2, σ i s = σ ̂ ∪ σ i s, σ i = σ ̂ ∪ σ i,

(11)

in which

σ i s

and σ_i should be executed under the additional labels

σ ̂

; thus, the self-loop labels of

S p 1, S p 2

are satisfied.

Finally, we find the potential ordering by checking whether a subtask

σ i s

is in conflict with another subtask while

(ω i 1, ω j 1) ∈ ⪯ 1

. If so, an additional ordering will be added to ⪯ as:

⪯ = ⪯ ∪ (ω i, ω j) | σ j ⊨ σ i s

(12)

Then, Ω will be updated following Eqs. 10 and 11. Regarding the set of subtasks Ω that have no conflicts in ⪯, its “not-equal” relations ≠ are generated by a simple combination as:

≠ = ≠ 1 ∪ M Ω ω ℓ i | ω ℓ i ∈ ≠ 2 .

(13)

The resulting poset P_i = (Ω, ⪯ , ≠) is added to

P final

As shown in Fig. 9, Relation update gets two R-posets and the partial relations of each R-poset are succeeded from P_i, P_final. Due to the anytime property, the two-step procedure can be repeated until all possible R-posets are found or just ended when the first R-poset is found.

Task assignment

To satisfy the final R-poset P_final = (Ω_f, ⪯_f, ≠_f), the subtasks Ω_f should be executed under the partial orders of ⪯_f, ≠_f. Each subtask ω_i ∈ Ω_c represents a collaborative behavior C_j. Thus, we can redefine the action sequence of each agent

J u o ∈ J o

J u o = [(t k, ω k, a k), ⋯]

and the action sequence of each object

J u o ∈ J o

J u o = [(t k, ω k, a k), ⋯]

, in which we replace the cooperative behavior C_k with ω_k since C_k ∈ σ_k.

We propose a suboptimal algorithm called time bound contract net (TBCN) to generate and adapt the action sequence of agents and objects. Compared with the classical contract net method [55], the main differences are as follows: (a) the partial order ⪯_f, ≠_f of R-poset might be changed when new formula is added, (b) the assigned subtasks should satisfy the partial orders in ⪯_f, ≠_f, (c) the cooperative task should be fulfilled by multiple agents, and (d) interactive objects should be considered as an additional constraints. TBCN solves these differences with three steps: First, we design a cancellation mechanism in the initialization to adapt to the changes of R-poset mentioned in (a). Second, the partial orders in (b) are guaranteed by only assigning feasible subtasks but not all unassigned subtasks in each loop. Third, the constraints mentioned in (c) and (d) are considered as a time bound t₁ ∈ ℝ⁺ in the bidding process.

(a) Initialization: Once the R-poset P_final is updated by Poset-prod, we first collect the action sequence

J, J o

of previous solution and the set of finished subtasks Ω_finish from executing word w₀. Specially, all of them will be empty if it is the first round. Then, a set of essential conflict subtasks Ω_ec is defined to collect the subtasks that might conflict the updated partial orders ⪯_f, ≠_f:

Ω ec = ω i | ω i ⪯ f ω j, t i ≥ t j, ∀ ω i, ω j ∈ Ω 1 / Ω finish ∪ {ω i ∣ ω i ≠ f ω j, t j ≤ t i ≤ t j + d j, ∀ ω i, ω j ∈ Ω 1 / Ω finish} .

(14)

Then, we compute the set of subtasks Ω_conf that should be removed from

J, J o

Ω conf = {ω j ∣ ω i ⪯ f ω j, ∀ ω i ∈ Ω 1 / Ω finish, ∀ ω j ∈ Ω ec} ∪ Ω ec,

(15)

in which Ω_conf are the subtasks in Ω_ec and the subtasks whose pre-subtasks will be removed. With the action sequences

J, J o

removing all the subtasks in Ω_conf, we can initialize the set of assigned subtasks

Ω as = {ω i | ∀ (t i, ω i, a i) ∈ J}

and the set of unassigned subtasks Ω_u = Ω_f/Ω_as.

(b) Computation of feasible subtasks: After initialization, the algorithm starts a loop to assigned the subtasks in Ω_u: getting a set of feasible subtasks Ω_fe; calculating their time bounds; choosing the best subtask. For the ordering constraints ⪯_c, if (ω_j, ω_i) ∈ ⪯_c, assigning (t_{k_j}, ω_j, a_{k_j}) to a task sequence J_i = ⋯(t_{k_i}, ω_i, a_{k_i}) will violate such constraints. Thus, Ω_fe based on current Ω_as is defined as:

Ω fe = ω i | ω i ∈ Ω u, ∀ ω j, ω i ∈ ⪯ f, ω j ∈ Ω as,

(16)

in which the subtasks may lead to unfeasible action sequences being eliminated.

(c) Online bidding: Then, we will try assigning each subtask in Ω_fe and only choose the one with the best result. Without loss of generality, we assume that subtask ω_i ∈ Ω_fe requires a label

C k k 1, k 2 u k

, which means that the agents need to execute the behavior C_k from region W_k₁ to region W_k₂ using object u_k. Any constraint mentioned in (b), (c), and (d) can be considered as a time bound t₁ ∈ ℝ⁺, which means that such constraint can be satisfied after t₁. Here, we use three kinds of time bounds: the global time bound

t i ω

, the object time bound

t u k o

, and the set of local time bounds T_s. The global time bound

t i ω

is the time instance that the ordering constraints ⪯_f and conflict constraints ≠_f will be satisfied if behavior

C k k 1, k 2 u k

is executed after

t i ω

t i ω ≥ t j, ∀ j, i ∈ ⪯ f, ω j ∈ Ω as, t i ω ≥ t ℓ + d ℓ, ∀ ω i, ω ℓ, ⋯ ∈ ≠ f, ω ℓ ∈ Ω as .

(17)

For the required object u_k, assuming its participated last task is

J u k o − 1 = t ℓ, ω ℓ

, the object time bound

t u k o

should satisfy that:

W u p t = W k 1, t ≥ t ℓ + d ℓ,, ∀ t ≥ t u k o,

(18)

which means that the object u_k will be at the starting region W_k₁ of the current behavior

C k k 1, k 2 u k

and ready for it after

t u k o

. Additionally, we set

t u k o = ∞

=∞ if the object is not at W_k₁ after the action sequence

J u k o

, and we set

t u k o = 0

if the behavior

C k k 1, k 2 u k

does not require object as u_k = ∅. The set of local time bound is defined as

T s = {A n, t n a ∣ A n = A M type n ∩ A C k, t n a = t ℓ + d ℓ + d G k ℓ, k 1, ∀ n ∈ N},

(19)

where

A n

is the set of actions that agent n can provide for behavior C_k,

t n a

is the earliest time agent n can arrive region W_k₁, t_ℓ + d_ℓ is the time when the last subtask ω_ℓ ∈ J_n−[1] has finished,

d G k ℓ, k 1

is the cost of moving, and W_{k_ℓ} is the goal region of ω_ℓ.

(A n, t n a)

means agent n can begin behavior

C k k 1, k 2 u k

after time

t n a

by providing one of action

a ℓ ∈ A n

. Using these time bounds, we can determine the agents and their providing actions and generate a new party assignment

J i, J o i

to minimize the ending time of subtask ω_i. The efficiency η of each assignment

J i, J o i, ω i ∈ Ω u

is calculated, and the subtask with maximum efficiency will be chosen. Afterward, the chosen subtask is removed from Ω_un and added to Ω_as. The action sequences

J, J o

are updated accordingly as

J = J i, J o = J o i

for the next iteration.

Correctness and completeness analysis

Theorem 1 (Correctness) Given two R-posets P₁ = (Ω₁, ⪯₁, ≠₁), P₂ = (Ω₂, ⪯₂, ≠₂) generated from

B 1, B 2

, we have

L P j ⊆ L P 1 ∩ L P 2

, where

P j ∈ P final

P final = P 1 ⊗ P 2

Proof 1 If a word w = σ′₁σ′₂⋯ satisfies

P j = Ω j, ⪯ j, ≠ j,

P j ∈ P final

it satisfies the three conditions mentioned in Definition 3. In first condition, due to the step Task composition of Poset-prod, we can infer that for any

ω n 1 = n, σ n 1, σ n s 1 ∈ Ω 1,

there exists

ω n = n, σ n, σ n s ∈ Ω j

with

σ n 1 ⊆ σ n, σ n s 1 ⊆ σ n s

. Thus, we have

σ i 1 1 ⊆ σ i 1 ⊆ σ ′ ℓ 1

and

σ i 1 s 1 ∩ σ ′ m 1 = ∅, ∀ m 1 < ℓ 1

, which indicates that w satisfies P₁ for condition 1. For the second condition, due to Eq. 9 in step Relation update, we have ⪯_i ⊆ ⪯_j. Thus, we can infer that w satisfies P₁ for the second condition: Any

σ i 1 s 1 ∩ σ ′ m 1 = ∅, ∀ m 1 < ℓ 1

, we have (ω_i₁, ω_i₂) ∈ ⪯_j, thus

∃ ℓ 1 ≤ ℓ 2,

σ i 2 1 ⊆ σ i 2 ⊆ σ ′ ℓ 2

, and

∀ m 2 < ℓ 2, σ ℓ 2 s 1 ⊆ σ ℓ 2 s ⊆ σ ′ m 2

. Additionally, for the last condition, as the word w satisfied the ≠_j order of P_j. We have ≠₁ ⊆ ≠_j due to Eq. 13. Thus, the word w also satisfied the third condition. In the end, we can conclude that w satisfies P₁. In the same way, we can proof the w also satisfies P₂. Thus,

L P j ⊆ L P 1 ∩ L P 2 .

Theorem 2 (Completeness) Given two R-posets P₁, P₂ getting from

B 1, B 2

, with enough time budget, Poset-prod returns a set

P final

consisting of all final product, and its language

L P final = ⋃ P i ∈ P final ‍ L P i

is equal to

L P final =

L P 1 ⋂ ‍ L P 2

Proof 2 Due to Theorem 1, it holds that

L P final ⊆

L P 1 ⋂ ‍ L P 2

. Thus, we only need to show that

L P 1 ⋂ ‍

L P 2 ⊆ L P final

. Given a word w = σ′₁σ′₂⋯ and

w ∈ L P 1 ⋂ ‍

L P 2

, w satisfies the first condition in Definition 3 for both P₁ and P₂ that:

∀ ω i 1 1 = i 1, σ i 1 1, σ i 1 s 1 ∈ Ω 1

, there exists σ′_j₁ with

σ i 1 1 ⊆ σ ′ j 1

and

σ j 1 s 1 ⊆ σ ′ m 1, ∀ m 1 < j 1

;

∀ ω i 2 2 = i 2, σ i 2 2, σ i 2 s 2 ∈ Ω 2

, there exists σ′_j₂ with

σ i 2 2 ⊆ σ ′ j 2

and

σ j 2 s 2 ⊆ σ ′ m 2, ∀ m 2 < j 2

. If j₁ = j₂, the step in Eq. 8 of Task composition will generate a subtask ω_i₁ ∈ Ω_j with

ω i 1 ⊨ ω i 1 1, ω i 2 2

, and

σ i 1 ⊆ σ ′ j 1, ∀ m 3 < j 1, σ j 1 s ⊆ σ ′ m 3

. If j₁ ≠ j₂, the (7) will generate

ω M Ω i 2 ⊨ ω i 2 2

, with

σ M Ω i 2 ⊆ σ ′ j 2, ∀ m 4 < j 2, σ M Ω i 2 s ⊆ σ ′ m 4

. Thus, there exists

P j ∈ P final

that satisfies the first condition. For the second condition, ⪯_j consisting of two parts generated by Eqs. 9 and 12 guarantees that

w ∈ L P 1 ∩ L P 2

. Moreover, Eqs. 10 and 11 guarantee that w does not conflict the self-loop constraints of P₁, P₂. Thus, the second condition is satisfied. Regarding the third condition, since ≠_j = ≠₁ ∩ M_Ω(≠₂) holds in Eq. 13, ≠_j is naturally satisfied by w. In conclusion, for any

w ∈ L P 1 ∩ L P 2

w ∈ L P final

holds and vice versa. Thus,

L P final

is equal to

L P 1 ⋂ ‍ L P 2

Funding

Less

National Natural Science Foundation of China (U2241214, 62373008, 62203017, and T2121002)

References

Less

Jemal

, Kechaou

, Ayed

, Alimi

. A multi agent system for hospital organization. Intl J Mach Learn Comput. 2015;5(1):51–56.

Cliff OM, Fitch R, Sukkarieh S, Saunders DL, Heinsohn R. Online localization of radio tagged wildlife with an autonomous aerial robot system. Paper presented at: Robotics: Science and Systems; 2015 Jul; Rome, Italy.

Zhang

, Hammad

, Bahnassi

. Collaborative multi-agent systems for construction equipment based on real-time field data capturing. J Inform Technol Constr (ITcon). 2009;14:204–228.

Arai

, Pagello

, Parker

. Advances in multi-robot systems. IEEE Trans Robot Autom. 2002;18:655–661.

Toth

, Vigo

. An overview of vehicle routing problems. In: The vehicle routing problem. Philadelphia, PA: SIAM; 2002. p. 1–26.

Fink J, Hsieh MA, Kumar V. Multi-robot manipulation via caging in environments with obstacles. Paper presented at: 2008 IEEE International Conference on Robotics and Automation. 2008; Pasadena, CA, USA.

Arm

, Waibel

, Preisig

, Tuna

, Zhou

, Bickel

, Ligeza

, Miki

, Kehl

, Kolvenbach

, et al. Scientific exploration of challenging planetary analog environments with a team of legged robots. Science. Robotics. 2023;8(80):eade9548.

Varava

, Hang

, Kragic

, Pokorny

. Herding by caging: A topological approach towards guiding moving agents via mobile robots. Paper presented at: Robotics: Science and Systems; 2017; Cambridge, MA, USA.

Ozkan-Aydin

, Goldman

. Self-reconfigurable multilegged robot swarms collectively accomplish challenging terradynamic tasks. Sci Robot. 2021;6(56):eabf1628.

10.

Ruan

, Duan

, Sun

, Yuan

, Xia

. Multiplayer reach–avoid differential games in 3D space inspired by Harris' hawks' cooperative hunting tactics. Research. 2023;6:0246.

11.

Kartik

, Murthy

, C. Task allocation algorithms for maximizing reliability of distributed computing systems. IEEE Trans. Comput. 1997;46(6):719–724.

12.

Agrawal P, Varakantham P, Yeoh W. Scalable greedy algorithms for task/resource constrained multi-agent stochastic planning. Paper presented at: Proceedings of the 25th International Joint Conference on Artificial Intelligence. 2016 July 9; New York.

13.

Keshanchi

, Souri

, Navimipour

. An improved genetic algorithm for task scheduling in the cloud environments using the priority queues: Formal verification, simulation, and statistical testing. J Syst Softw. 2017;124:1–21.

14.

, Zhang

, Yang

. Multi-AUV autonomous task planning based on the scroll time domain quantum bee colony optimization algorithm in uncertain environment. PLOS ONE. 2017;12(11): Article e0188291.

15.

, Xiao

. Flexible wolf pack algorithm for dynamic multidimensional knapsack problems. Research. 2020;2020:1762107.

16.

Yan M, Yuan H, Xu J, Yu Y, Jin L. Task allocation and route planning of multiple UAVs in a marine environment based on an improved particle swarm optimization algorithm. EURASIP J Adv Signal Process. 2021;1–23.

17.

Biswas S, Anavatti SG, Garratt MA. Particle swarm optimization based co-operative task assignment and path planning for multi-agent system. Poster presented at: 2017 IEEE Symposium Series on Computational Intelligence (SSCI). 2017; Honolulu, Hawaii, USA.

18.

Wells

, Dantam

, Shrivastava

, Kavraki

. Learning feasibility for task and motion planning in tabletop environments. IEEE Robot Autom Lett. 2019;4(99):1255–1262.

19.

Omidshafiei S, Pazis J, Amato C, How JP, Vian J. Deep decentralized multi-task multi-agent reinforcement learning under partial observability. Paper presented at: Proceedings of the 34th International Conference on Machine Learning. 2017; Sydney, Australia.

20.

Liu M, Ma H, Li J, Koenig S. Task and path planning for multi-agent pickup and delivery. Paper presented at: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS). 2019; Montreal, QC, Canada.

21.

Zhang H, Du W, Shan J. Building cooperative embodied agents modularly with large language models. arXiv. 2023. arXiv:2307.02485.

22.

Ruan J, Chen Y, Zhang B. Tptu: Task planning and tool usage of large language model based ai agents. arXiv. 2023. arXiv:2308.03427.

23.

Baier

, Ketoen

. Principles of model checking. Cambridge, MA: MIT Press; 2008.

24.

Koymans

. Specifying real-time properties with metric temporal logic. Real-Time Syst. 1990;2:255–299.

25.

Maler

, Nickovic

. Monitoring temporal properties of continuous signals. In: International Symposium on Formal Techniques in Real-Time and Fault-Tolerant Systems.Heidelberg, Germany: Springer; 2004. p. 152–166.

26.

Luo

, Kantaros

, Zavlanos

. An abstraction-free method for multirobot temporal logic optimal control synthesis. IEEE Trans Robot. 2021;37(5):1487–1507.

27.

Guo

, Dimarogonas

. Task and motion coordination for heterogeneous multiagent systems with loosely coupled local tasks. IEEE Trans Autom Sci Eng. 2016;14(2):797–808.

28.

Luo

, Zavlanos

. Temporal logic task allocation in heterogeneous multi-robot systems. IEEE Trans Robot. 2022;38(6):3602–3621.

29.

Sahin

, Nilsson

, Ozay

. Multirobot coordination with counting temporal logics. IEEE Trans Robot. 2019;36(4):1189–1206.

30.

Jones

, Leahy

, Vasile

. ScRATCHS: Scalable and robust algorithms for task-based coordination from high-level specifications. Proc Int Symp Robot Res. 2019;38(4):1–16.

31.

Schillinger

, Bürger M, and Dimarogonas DV. Decomposition of finite LTL specifications for efficient multi-agent planning. Paper presented at: International Symposium on Distributed Autonomous Robotic Systems;2016; London, UK.

32.

Schillinger

, Burger

, Dimarogonas

. Simultaneous task allocation and planning for temporal logic goals in heterogeneous multi-robot systems. Int J Robot Res. 2018;37(7):818–838.

33.

Kantaros

, Zavlanos

. Stylus*: A temporal logic optimal control synthesis algorithm for large-scale multi-robot systems. Int J Robot Res. 2020;39:812–836.

34.

, Yin

, Li

. Security-preserving multi-agent coordination for complex temporal logic tasks. Control Eng Pract. 2022;123(1): Article 105130.

35.

Lli

, Chen

, Wang

, Kan

. Fast task allocation of heterogeneous robots with temporal logic and inter-task constraints. IEEE Robot Autom Lett. 2023;8(8):4991–4998.

36.

Bonnet J, Gleizes MP, Kaddoum E, Rainjonneau S, Flandin G. Multi-satellite mission planning using a self-adaptive multi-agent system. Paper presented at: 2015 IEEE 9th International Conference on Self-Adaptive and Self-Organizing Systems. 2015; Cambridge, MA, USA.

37.

Yang Q, Luo Z, Song W, Parasuraman R. Self-reactive planning of multi-robots with dynamic task assignments. Paper presented at: 2019 International Symposium on Multi-Robot and Multi-Agent Systems (MRS). 2019; Boston, MA, USA.

38.

Faroni

, Umbrico

, Beschi

, Orlandini

, Cesta

, Pedrocchi

. Optimal task and motion planning and execution for multiagent systems in dynamic environments. IEEE Trans Cyber. 2023;1–12.

39.

Choudhury

, Gupta

, Kochenderfer

, Sadigh

, Bohg

. Dynamic multi-robot taskallocation under uncertainty and temporal constraints. Auton Robots. 2022;46:231–247.

40.

Tian

, Fang

, Yang

, Guo

, Cui

, Liang

, Wu

. Two-phase motion planning under signal temporal logic specifications in partially unknown environments. IEEE Trans Ind Electron. 2023;70(7):7113–7121.

41.

Ben-Ari

. A primer on model checking. ACM Inroads. 2010;1(1):40–47.

42.

Gastin P, Oddoux D. Fast LTL to Bu¨chi Automata Translation. Paper presented at: Proceedings of the 13th International Conference on Computer Aided Verification. 2002; Copenhagen, Denmark.

43.

Ding

, Smith

, Belta

, Rus

. Optimal control of Markov decision processes with linear temporal logic constraints. IEEE Trans Automat Contr. 2014;59(5):1244–1257.

44.

Kloetzer M, Mahulea C. Accomplish multi-robot tasks via Petri net models. Paper presented at: 2015 IEEE International Conference on Automation Science and Engineering (CASE). 2015; Gothenburg, Sweden.

45.

Leahy

, Serlin

, Vasile

, Schoer

, Jones

, Tron

, Belta

. Scalable and robust algorithms for task-based coordination from high-level specifications (ScRATCHeS). IEEE Trans Robot. 2022;38(4):2516–2535.

46.

Schillinger

, Burger

, Dimarogonas

. Hierarchical LTL-task mdps for multi-agent coordination through auctioning and learning. Intl J Robot Res. 2019;153:104085.

47.

Kantaros

, Zavlanos

. Sampling-based optimal control synthesis for multirobot systems under global temporal tasks. IEEE Trans Automat Contr. 2018;64(5):1916–1931.

48.

Piterman

, Pnueli

, Sa'ar

. Synthesis of reactive(1) designs. J Comput Syst Sci. 2006;78(3):364–380.

49.

Vasilopoulos

, Kantaros

, Pappas

, Koditschek

. Reactive planning for mobile manipulation tasks in unexplored semantic environments. Paper presented at: International Conference on Robotics and Automation; 2021; Xi'an, China.

50.

Verginis

, Dimarogonas

. Multi-agent motion planning and object transportation under high level goals. IFAC World Congress; Sydney, Australia, 2018.

51.

Lacerda B, Parker D, Hawes N. Optimal and dynamic planning for Markov decision processes with co-safe LTL specifications. Paper presented at: 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems. 2014; Chicago, IL, USA.

52.

Feyzabadi S, Carpin S. Multi-objective planning with multiple high level task specifications. Paper presented at: 2016 IEEE International Conference on Robotics and Automation (ICRA). 2016; Stockholm, Sweden.

53.

Belta

, Yordanov

, Gol

. Formal methods for discrete-time dynamical systems. Heidelbery, Germany: Springer; 2017.

54.

Liu

, Guo

, Li

. Time minimization and online synchronization for multi-agent systems under collaborative temporal logic tasks. Automatica. 2024;159: Article 111377.

55.

Smith. The contract net protocol: High-level communication and control in a distributed problem solver. IEEE Trans. Comput. 1980;C-29(12):1104–1113.

56.

Faruq F, Parker D, Laccrda B, and Hawes N. Simultaneous task allocation and planning under uncertainty. Paper presented at: 2018 IEEE/RSJ international conference on intelligent robots and systems (IROS). 2018; Madrid, Spain.

57.

Verginis

, Dimarogonas

. Multi-agent motion planning and object transportation under high level goals. IFAC-PapersOnLine. 2017;50:15816–15821.

58.

Kozen

, Kozen

. Depth-first and breadth-first search. Design Anal Algorithms. 1992;19–24.

Appendix

Less

Year 2024 volume 7 Issue 3

PDF

147

Cite this Article

BibTeX

Article Info

doi: 10.34133/research.0337

Receive Date：2024-01-02
Online Date：2025-07-24
Published：2024-03-20

Article Data

Affiliations

History

Received：2024-01-02
Accepted：2024-02-17

Funding

National Natural Science Foundation of China (U2241214, 62373008, 62203017, and T2121002)

Affiliations

¹Department of Mechanics and Engineering Science, College of Engineering, Peking University, Beijing 100871, China.

² Science and Technology Commission of China Aerospace Science and Technology Corporation, Beijing 100048, China.

Corresponding:

^* Address correspondence to: zhongkli@pku.edu.cn

References

Share

https://castjournals.cast.org.cn/joweb/research/EN/10.34133/research.0337

Share to

Scan QR to access full text

Cite this article

BibTeX

Citations

表12种不同金属材料的力学参数

科 Family	属数 Number of genus	种数 Number of species	占总种数比例 Percentage of total species (%)	属 Genus	种数 Number of species	占总种数比例 Percentage of total species (%)
鹅膏菌科Amanitaceae	2	11	5.26	鹅膏菌属 Amanita	10	4.78
小菇科 Mycenaceae	2	12	5.74	丝盖伞属 Inocybe	5	2.39
多孔菌科 Polyporaceae	8	14	6.70	蜡蘑属 Laccaria	5	2.39
红菇科 Russulaceae	3	23	11.00	小皮伞属 Marasmius	6	2.87
				小菇属 Mycena	11	5.26
				光柄菇属 Pluteus	5	2.39
				红菇属 Russula	17	8.13
				栓菌属 Trametes	5	2.39

关闭全屏

BibTeX
EndNote
RefWorks
TxT

Fig. 1. Example of Poset-prod associated with four formulas $φ 1 = ⋄ D w 7, w 7 ∅ ∅ ∧ ⋄ C w 7, w 7 ∅ ∅ ∧ ¬ M w 7, w 7 ∅ ∅ ∧ ⋄ M w 7, w 7 ∅ ∅$ , $φ 2 = ⋄ C w 7, w 7 ∅ ∅ ∧ ¬ G w 7, e 3 1 ∧ ⋄ G w 7, e 3 1 ∧ ¬ D w 7, w 7 ∅ ∅ U C w 7, w 7 ∅ ∅$ , $φ 3 = ⋄$ $C w 3, w 3 ∅ ∅ ∧ ¬ T w 3, e 2 5 ∧ ⋄ D w 3, w 3 ∅ ∅ ∧ ⋄ T w 3, e 2 5$ , and $φ 4 = ⋄ T w 1, o 1 2 ∧ ¬ P o 1, o 1 2 ∧ ⋄ P o 1, o 1 2 ∧ ¬ R o 1, o 1 ∅ ∅ ∧ ⋄ T o 1, w 1 2 ∧ ⋄ R o 1, o 1 ∅ ∅$ . Left: $B 1, ⋯, B 4$ are the NBAs of formula φ₁, ⋯, φ₄. Middle: P₁, ⋯, P₄ are the R-posets of $B 1, ⋯, B 4$ . Right: $B 1, ⋯, B 4$ P₁ ⊗ P₂ = {P_f1, P_f2}, $P f 3 ∈ P f 1 ⊗ P 3 and P f 4 ∈ P f 3 ⊗ P 4$ .

Fig. 2. The bidding process of assigning subtask 4 under the global time bound (when the partial relations in R-poset are satisfied), the time bound for object (when the object is reachable), and local time bounds (when the agents get ready).

Fig. 3. Illustration of computing the product posets. Left: Final R-poset computed from the initial subformulas and online subformulas (dashed lines). Right: $P b 1, ⋯, P b 6$ are the R-posets associated with the initial subformulas, and $P u 1, ⋯, P u 5$ are associated with the online subformulas.

Fig. 4. (A, C, E,and G): Snapshot of agent trajectories at 40, 180, 215, and 400 s when new tasks are added online. (B, D, F, and H): Gantt graph of task assignment at these time instants, as highlighted in green boxes.

Fig. 5. The computation time (left) and the execution efficiency η (right) with respect to different number of agents and subtasks.

Fig. 6. The comparison of the computation time (left) and memory (right) by different methods.

Fig. 7. The trajectories of agents in experiment (left), experimental snapshot (middle), and the final R-poset (right).

Fig. 8. The Gantt graph of the planned execution (left) and the Gantt graph of the actual execution (right).

Fig. 9. Framework of the proposed method. For a new input formula φ_i, we first change it into an R-poset P_i with the method in the “Preliminaries” section. Then, Poset-prod between P_i and P_final is calculated as in the “Product of R-posets” section. Third, after the R-poset P_final is updated, the TBCN method in the “Task assignment” section determines and adjusts the task sequence of each agent and object. New tasks are released when the agents execute their local plans and detect new objects online.

Articles: Latest Articles; Most Read; Collections

Updates: Events; News; Multimedia

About: About Us

Contact

No. 86 Xueyuan South Road, Haidian District, Beijing

100081

010-62199257

qkjq@cast.org.cn

Copyright © 2025 China Association for Science and Technology. All rights reserved. For all open access content, the relevant licensing terms apply.
Sponsored by the Office of the Leading Group for Cybersecurity and Informatization of CAST, and supported by Science and Technology Review Publishing House