INTRODUCTION
A reactive embedded system must respond to the inputs from the environment at the speed and with the delay dictated by the environment. The embedded software coordinates system operations. Scheduling of these operations is subject to satisfy hard time constrains and make most use of the system resources (minimize CPU idle time).
SYSTEM, PROCESSES AND TASKS
We define an embedded system as a set of concurrent processes. A set of input and output ports are defined for each process and pointtopoint communication between processes occurs through unidirectional channels between ports. We support multirate communication, in which the number of objects read or written by a process at any given time may be an arbitrary constant. A process may communicate with the environment in which the system is executed. This is done through input and output ports for which no channel is defined. Such primary input ports can belong to one of two classes, which we call controllable and uncontrollable. The latter is used to trigger an execution of the system, i.e., when the system receives an object at an uncontrollable port, it reacts to the environment by performing operations. On the other hand, the system may request the environment for further inputs through controllable ports, while this is not allowed for uncontrollable ports. Output ports are always written under system control and the environment must be ready to accept them at any time (as allowed by the concurrent process specification). We restrict our attention to processes described as sequential programs and whose implementation is mapped as software to be executed on a programmable processor. The sequential program for each process is specified in Flow C, which is based on C, but extended to specify communication operations. It may contain both datadependent control statements (e.g., ifthenelse or arbitrary while or for loops) and synchronization dependant control statements (e.g., SELECT). A task is generated for each input uncontrollable port, which performs the operations that required to react to an event of that port.
Static, quasistatic and dynamic scheduling: Static scheduling does most of the work at compile time, so the resulting software behavior is highly predictable and the overhead due to task context switching is minimized. They may also achieve very high.
2 CPU utilization if the rate of arrival of inputs to be processed from the environment has predictable regular rates that are reasonably known at compile time. Static scheduling, however, is limited to specifications without runtime choice called Marked Graphs or Static Dataflow (Cortadella et al., 2000).
Researchers have recently started looking into ways of computing a static execution order for operations as much as possible, while leaving data or synchronizationdependant choices at runtime. This body of work is known as Quasi Static Scheduling (QSS).
Dynamic scheduling is commonly used for a system which contains realtime controls and the run time behavior is heavily dependent on the occurrence of external events.
Petri net: A Petri net (PN) is defined by a tuple of (P, T, F, Mo), where P, T are sets of places and transitions respectively. F is a function from (TxP) (PxT) to nonnegative integers. A marking is a function from P to nonnegative integers, where its output for a place is denoted by M[p], which we call the number of tokens at p in M. If M[P] is positive, the place is said to be marked at M. Mo is a marking, which we refer to as the initial marking. A Petri net can be represented by a directed bipartite graph, where an edge [u,v] exists if
F([u,v]) is positive, which is called the weight of the edge. We may call v a successor of u and u a predecessor of v, respectively. A transition is enabled at a marking M[p], if M[p]≥F([p,t]), for all p of P.
In this case, one may fire the transition at the marking, which yields a marking
M’ given by: M’[p] = M[p] – F(p,t) + F(t,p), for each p of P.
In the sequel, M
M’ denotes the fact that a transition t is enabled at a marking M and M’
is obtained by firing t at M. A sequence of transitions (t1,t2,…tk) is
said to be fireable from a marking M, if there exists a sequence of markings
(M1, M2,…Mk) such that M1 = M and Mi
Mi+1’ holds for each I=1,2,…k. A transition is said to be a source,
if F(p,t) = 0 for all p of P. A marking M’ is said to be reachable from
M, if there is a sequence of transitions fireable from M that leads to M’.
The set of markings reachable from the initial marking is denoted by R(M0).
The reachability graph of a Petri net is a directed graph in which R(M0) is
the set of nodes and each edge [M,M’] is a transition t with.
The reachability tree of a Petri net is a tree in which each node is labeled
with a marking of R(M0), the root node is labeled with M0 and each edge [v,v’]
represents a transition with M→M’.
A key notion we use in Petri nets for defining schedules is equal conflict sets. A pair of nonsource transitions ti and tj is said to be in equal conflict, if F([p,ti]) = F([p,tj]), for all p of P. These transitions are in conflict in the sense that ti is enabled at a given marking, if and only if tj is enabled, i.e., if the firing of one transition disables ti, it also disables tj. The equal conflict is an equivalence relation defined on the set of nonsource transitions and each equivalence class is called Equal Conflict Set (ECS). As a special case, we also define as an ECS a set that consists of a single source transition. By definition, if one transition of an ECS is enabled at a given marking, all the other transitions of the ECS are also enabled. Thus, we may say that this ECS is enabled at the marking.
A place is said to be a choice place if it has more than one successor transition. A choice place is Equal Choice (a generalization of free choice) (Sgroi et al., 1999) if all the successor transitions are in the same ECS. A Petri net is Equal Choice if all choice places are equal. A choice place is unique if no more than one successor transition can be enabled in any of the markings of R(M0). A Uniquechoice Petri Net (UCPN) is that in which all choice places are either equal or unique.
Overview: A system function is represented as a network of processes. A process is specified as a sequential program written in FlowC. The network of processes is transformed into a single Petri net, which is built in two steps. In the first step, called compilation, a Petri net for each process is constructed and each port is associated to a place of the Petri net.
The second step, called linking, builds a Petri net by connecting the Petri
nets according to the defined channels. The scheduling algorithm takes as input
a Petri net and an uncontrollable source transition for which a schedule is
sought. The goal of code generation is to synthesize a sequential program to
be run on a microprocessor which implements the system behavior, based on the
schedule computed with the algorithm.
SCHEDULING ALGORITHM
A task is generated for each uncontrollable input port and thus we compute a schedule for each uncontrollable source transition. A schedule for a given uncontrollable source transition is a directed graph.
Definition and properties of schedules: A schedule is a representation of all possible execution flows of the tasks that can be executed with finite memory for arbitrary input streams. It has five properties:
• 
Exactly one node (vertex) for Mo 
• 
The set of transitions associated with the outgoing edges of node v is
an ECS enable at M(v) 
• 
holds, for each edge [v,w] 
• 
Each node has at least one path to an await node 
• 
Each await node is on at least one cycle 
Scheduling algorithm: The basic steps of the algorithm are given below.
• 
Create the root r and set M(r) = Mo 
• 
Create a node v and an edge [r, v] 
• 
Associate the transition t1 to the edge and a marking with v, s.t.
holds 
• 
Call function EP (v,r), if return r, schedule found, then call postprocessing
and terminate 
• 
Termination condition: The irrelevance criterion 
Two basic functions are given as:
Heuristics using tinvariants: To find an entry point of a node, the
algorithm may explore all possible ECS’s at the node until it finds a desired
one. The number of nodes created in the algorithm depends on the order of ECS’s
explored. Although the ordering does not influence the worstcase search space
or run time of the algorithm, some orderings help finding a desired entry point
sooner than others. We present a heuristic approach of sorting the ECS’s
for this purpose. The heuristic helps keeping the resulting schedules small.
It also provides us with a sufficient nonschedulability condition and if the
condition holds, we can immediately terminate the procedure, reporting no schedule.
The heuristic method tries to find a short sequence of transitions such that if the sequence is fired from the current node, a marking associated with some ancestor of the node can be obtained. such an ancestor becomes a candidate of an entry point returned by the function EP for the node. We use tinvariants in finding such a sequence.
A tinvariant is a vector of nonnegative integers that solves the system of homogeneous marking Eq. C.X = 0, where C = [C_{i.j} = F (t_{j}, P_{j})  F (P_{i}, t_{j}). C is the incidence matrix of the Petri net. A tinvariant represents a set of sequences in which the number of occurrences of the ith transition is given by the integer at the ith position of the tinvariant. We say that such a sequence is contained in the tinvariant.
Each of such sequences has a property that if the sequence can be fired from a marking M, the marking obtained after the firing is also M. We call a nonnegative basis of the homogeneous marking equations a base of tinvariants. Such a basis can be obtained by using a Smith Normal Form decomposition.
It is known that a schedule does not exist if there is no base of tinvariants.
Therefore, if the algorithm identifies this case, it terminates immediately
without applying the function EP. If a base of tinvariants is found, using
the base and a sequence of transitions associated with the path from the root
to the current node, the heuristic computes a vector of nonnegative integers
called the promising vector in EP. The positions of the vector correspond to
the transitions and those with positive integers represent transitions to be
fired from the current node. The ECS’s are sorted using this vector.
If we have several enabled ECSs, which all have transitions appear in the tinvariants, then sorting tinvariants can help to find a better tinvariant as the promising vector which leads to a cycle. Our heuristic sorts tinvariants according to their length. The length of a tinvariant is defined as the total number of firings in the tinvariant. We choose the shortest tinvariant, if possible. Because based on experience, it is found that a shorter tinvariant has a better chance to make a cycle and produce a small schedule, thus small code size. Reusing firing sequence is also an approach to reduce the resulting code size, even though it may not reduce the size of the schedule. Our current algorithm only consider the cyclic firing sequence from and to await nodes, because we can find a necessary and sufficient condition of friability of a given firing sequence at await nodes. Generally, we only have a necessary condition. So first, we record all the cyclic firing sequences into a dictionary, then we come to an await node, we simply check the friability of previously fired cyclic firing sequences at that await node. If it is firable, we force the await node fire that sequence by give the associated ECS the highest priority. If it is not firable, we simply use the tinvariant to choose the associated ECS to be fire. One thing about checking friability is that we check not just the friability of a specific cyclic firing sequence, but also the parasitic firing sequences associated to the cyclic firing sequence.
This is because of the atomicity of ECS. It means that if a transition t appears in a schedule, then all the transition in the ECS of t will also appear in the schedule. Another consideration is that if we have several await nodes associated with the cyclic firing sequence, then we need to identify all the firable cyclic firing sequences. This is because if we have more than await node then several firable cyclic firing sequences may only have one cycle identified by the schedule.
IMPLEMENTATION AND EXPERIMENTS
Introduction to QSS1.2: The C source code of QSS1.2 has been modified to implement the ideas proposed in this report. The entire software synthesis flow from specifications to synthesized tasks, which consists of compiler, linker, scheduler and code generator, are implemented in QSS1.2 in a set of tools. The complete software implementation of the system, including the realtime scheduling of tasks, is supposed to be managed by an embedded operating system. QSS1.2 software package consists mainly four components: FCC, PNL, QSS and CGEN, (Fig. 1). FCC is a FlowC Compiler program. It accepts users’ input FlowC program and generates an intermediate petri net representation. PNL is the Petri Net Linker program.
It composes the intermediate representations generated by FlowC Compiler and output is a petri net file using hpn format. QSS is the Quasi Static Schedule program that finds a subtree of the reachability tree from a given Petri net. QSS output the subtree by using a hpn format and a dot format. CGEN is the Code Generator program that synthesizes code for a given schedule. CGEN output files with Poindexter C format, which is supported by Cadence VCC codesign environment. Therefore the synthesized code can be simulated and estimated in VCC. In addition, the QSS1.2 software includes some other supporting tool modules, for example, the manipulation tool of hierarchical Petri nets, the Tinvariance analysis tool, etc.
The simulation and verification of the synthesis code can be done in Cadence VCC codesign environment by importing the code into VCC.
Link list for fire sequences: In order to reuse previous successful fire sequences, a dynamic data structure is used to store the fire sequences. The linked list works very well in this case. Now the problem is how to store the fire sequences such that
• 
Fire sequences can be easily accessed and manipulated, for
example, add a new sequence, compare two fire sequences, record the frequency
of the fire sequences. 
• 
The important context information about each transition should not be
lost. 
• 
In addition, we hope to reuse previous code as much as possible and keep
the original structure of QSS1.2. 

Fig. 1: 
Structure of QSS1.2 
After carefully consideration, we define each item in the fire sequences link
list as following structure:
Actually the first field is the redundant information of second field. However,
the separation between these two fields provides great advantages. The operations
on fire sequences are classified into link list management function and fire
sequence analysis function. The link list management function on fire sequences
could be adding a fire sequence, deleting a fire sequence, retrieving a fire
sequence, etc., which can be easily implemented through string manipulation.
The fire sequence analysis function could be testing whether current node can
find a cycle by current sequence, etc, which can be simplified through transitions.
Experiments and results: The new implementation have been used to synthesis our demonstration example and examples given in the QSS1.2 software package. We compare both the schedule size and code size for the system.
The preliminary experiment results are presented in Table 1.
Table 1: 
Experiment results for code size. All values are in bvtes 

* No code generated (the input is Petri Net Flow C code) 
We find that both the schedule size and the code size generated by the new
implementation are smaller than or at most equal to those generated by QSS1.2
for some examples.
It shows that our ideas can reduce the code size reasonably in some cases. However, it is only a very limited experiment. More examples have to be tested and more experiments should be done.
CONCLUSIONS AND FUTURE WORK
In this report, we propose several ideas to optimize the single source task generation and compiletime scheduling for mixed datacontrol embedded system. Those ideas can be further formalized with appropriate theoretical analysis. Current implementation is just a preliminary step. It needs to be refined and cleaned. In addition, the experiments and measurement is fairly straightforward and simple with limited examples. More experiments will be done with different kind of examples. Furthermore, the synthesized code should be imported into VCC and functionally simulated to verify the correctness of the algorithm.
ACKNOWLEDGMENTS
We would like to thank Dr. Yosinori Watanabe and Dr. Alex Kondratyev for their great help proposed some ideas of this study and provided the QSS1.2 source code.