3.1. Dataflow Modeling
 Dataflow modeling involves representing an application using a directed graph G(V, E), where V is a set of vertices (nodes) and E is a set of edges. Each vertex u ∈ V in a dataflow graph is called an actor, and represents a specific computational block, while each directed edge (u, v) ∈ E represents a first-in-first-out (FIFO) buffer that provides a communication link between the source actor u and the sink actor v. A dataflow graph edge e can also have a non-negative integer delay, del(e), associated with it, which represents the number of initial data values (tokens) present in the associated buffer. Dataflow graphs operate based on data-driven execution, where an actor can be executed (fired) whenever it has sufficient amounts of data (numbers of “samples” or data “tokens”) available on all of its inputs. Typically, in DSP-oriented dataflow design environments, the execution of a dataflow graph can be thought of as that of a “globally asynchronous locally synchronous” (GALS) system [Suhaib et al., 2008; Shen and Bhattacharyya, 2009].
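 As a concrete illustration of these definitions, the following Python sketch (our own minimal representation, not code from any dataflow tool) models edges as FIFO buffers carrying del(e) initial tokens and implements the data-driven firing rule:

```python
from collections import deque

class Edge:
    """FIFO buffer linking a source actor to a sink actor; del(e) is
    modeled as a number of initial tokens already in the buffer."""
    def __init__(self, src, snk, delay=0):
        self.src, self.snk = src, snk
        self.tokens = deque([None] * delay)  # initial token values unspecified

class Graph:
    """Directed dataflow graph G(V, E); V is implied by the edge list."""
    def __init__(self):
        self.edges = []

    def add_edge(self, src, snk, delay=0):
        edge = Edge(src, snk, delay)
        self.edges.append(edge)
        return edge

    def inputs(self, actor):
        return [e for e in self.edges if e.snk == actor]

def can_fire(graph, actor, needed):
    """Data-driven firing rule: `actor` may fire only when every one
    of its input edges holds at least `needed[e]` tokens."""
    return all(len(e.tokens) >= needed[e] for e in graph.inputs(actor))
```

For instance, if an edge (u, v) carries del(e) = 2 initial tokens, then v can fire if it requires two tokens on that edge, but not if it requires three.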
 During each firing, an actor consumes a certain number of tokens from each input and produces a certain number of tokens on each output. When these numbers are constant (over all firings), we refer to the actor as a synchronous dataflow (SDF) actor [Lee and Messerschmitt, 1987]. For an SDF actor, the numbers of tokens consumed and produced in each actor execution are referred to as the consumption rate and production rate of the associated input and output, respectively. If the source and sink actors of a dataflow graph edge are SDF actors, then the edge is referred to as an SDF edge, and if a dataflow graph consists of only SDF actors and SDF edges, the graph is referred to as an SDF graph.
 For a dataflow graph edge e, src(e) and snk(e) denote its source and sink actors. If e is an SDF edge, then prd(e) denotes the production rate of the output port of src(e) that is connected to e, and similarly, cns(e) denotes the consumption rate of the input port of snk(e) that is connected to e.
 A static schedule for a dataflow graph G is a sequence of actors in G that represents the order in which actors are fired during an execution of G.
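 To make the notion of a static schedule concrete, the sketch below (the function and variable names are our own) replays a schedule on an SDF graph, applying prd(e) and cns(e) to per-edge token counts and flagging any firing that would violate the data-driven firing rule:

```python
def simulate_schedule(schedule, rates, delays=None):
    """Replay a static schedule on an SDF graph.

    `rates` maps each edge (src, snk) to (prd(e), cns(e)); `delays`
    maps edges to del(e).  Raises if some firing would consume tokens
    that are not yet available; otherwise returns the final per-edge
    token counts."""
    buf = {e: (delays or {}).get(e, 0) for e in rates}
    for actor in schedule:
        # Firing rule: every input edge must hold at least cns(e) tokens.
        for e, (prd, cns) in rates.items():
            if e[1] == actor and buf[e] < cns:
                raise RuntimeError(f"{actor} cannot fire: edge {e} underflows")
        for e, (prd, cns) in rates.items():
            if e[1] == actor:
                buf[e] -= cns   # consume from each input edge
            if e[0] == actor:
                buf[e] += prd   # produce onto each output edge
    return buf
```

For an edge (A, B) with prd(e) = 2 and cns(e) = 3, for example, the schedule A A A B B fires A three times and B twice, returning the buffer to empty; attempting to fire B first would be rejected.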
 Usually, production and consumption information — in particular, the number of tokens produced and consumed by individual firings (the production/consumption volume) — is characterized in terms of individual input and output ports, so that each port of an actor can in general have a different production or consumption volume characterization. Such characterizations can involve constant values, as in SDF [Lee and Messerschmitt, 1987] (as described above); periodic patterns of constant values, as in cyclo-static dataflow (CSDF) [Bilsen et al., 1996]; or more complex forms that are data-dependent [e.g., see Buck, 1993; Bhattacharya and Bhattacharyya, 2000; Murthy and Lee, 2002; McAllister et al., 2004; Plishker et al., 2008]. A meta-modeling technique called parameterized dataflow (PDF) allows limited forms of dynamic behavior [Bhattacharya and Bhattacharyya, 2000] in terms of run-time changes to dataflow graph parameters. The Boolean dataflow (BDF) [Buck, 1993] and core functional dataflow (CFDF) [Plishker et al., 2008] models are highly expressive (Turing complete) dynamic dataflow models. We explain the SDF, CSDF, and PDF models in greater detail later in this section.
 Apart from DIF, which we have mentioned earlier, there are various existing design tools with their semantic foundations in dataflow modeling, such as Ptolemy [Pino et al., 1995], LabVIEW [Johnson, 1997], StreamIt [Thies et al., 2002], CAL [Eker and Janneck, 2003], PeaCE [Kwon et al., 2004], Compaan/Laura [Stefanov et al., 2004], and SysteMoc [Haubelt et al., 2007]. Dataflow-oriented DSP design tools typically allow high-level application specification, software simulation, and possibly synthesis for hardware or software implementation [Bhattacharyya et al., 2010].
3.1.1. Synchronous Dataflow
 An SDF graph is characterized by its compile-time predictability, which follows from the statically known consumption and production rates defined above. Figure 6 shows a simple SDF graph with actors W, X, Y, and Z (shown as circles, i.e., vertices of the graph). Each edge (an arrow in the figure connecting a pair of actors) is annotated with the number of tokens produced on it by the source actor and the number consumed from it by the sink actor during every invocation of the source and sink actors, respectively. For example, actor X can be fired when there are at least two tokens on its input. Whenever actor X is fired, it consumes two tokens from its input buffer, and produces three tokens onto the output buffer connected to Y and two tokens onto the output buffer connected to Z.
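 The firing of actor X in this example can be sketched as follows; only the rates (two tokens in, three and two tokens out) come from the example, while the token values and the computation inside X are placeholders of our own:

```python
from collections import deque

in_buf = deque([10, 20, 30])   # input buffer of X (illustrative token values)
to_Y, to_Z = deque(), deque()  # output buffers on the edges to Y and Z

def fire_X():
    """One firing of SDF actor X: consume 2 input tokens, produce 3
    tokens toward Y and 2 toward Z (rates as in the example; the
    actual token values computed here are placeholders)."""
    if len(in_buf) < 2:
        raise RuntimeError("X has insufficient input tokens")
    a, b = in_buf.popleft(), in_buf.popleft()
    to_Y.extend([a, b, a + b])  # prd = 3 on the edge to Y
    to_Z.extend([a, b])         # prd = 2 on the edge to Z

fire_X()
```

After one firing, one token remains on the input of X, three tokens sit on the buffer toward Y, and two on the buffer toward Z.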
3.1.2. Cyclo-static Dataflow
 Many signal processing applications involve behaviors in which production and consumption rates may change during run-time. In some cases, however, these changes are known at compile-time. For example, consider the CSDF graph shown in Figure 1a, which contains a decimator actor M. This actor consumes one token from its input on each invocation, but produces a token onto its output only on every fourth invocation. This behavior is depicted using the varying production volumes denoted by [1 0 0 0]. The numbers of tokens produced by the decimator M follow this cyclic pattern with a period of 4. This sequence of varying production volumes, though not leading to a constant output rate as with an SDF actor, is still completely deterministic and known at compile-time. This kind of dataflow behavior, where actors exhibit token production and consumption volumes (in terms of tokens per firing on specific actor ports) that are either constant or expressible as cyclic sequences of constant volumes, is referred to as CSDF. Thus, CSDF can be viewed as a generalization of SDF in which token production and consumption volumes may differ across different firings of an actor, but follow cyclic patterns that are completely specified at compile-time.
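 The decimator's cyclo-static behavior can be sketched as follows (a hand-written simulation of ours, not tool code): the actor consumes one token on every firing and produces according to the cyclic pattern [1 0 0 0]:

```python
import itertools

def make_decimator(pattern=(1, 0, 0, 0)):
    """Cyclo-static decimator M: cns = 1 on every firing; prd cycles
    through `pattern`, so with [1 0 0 0] a token is produced only on
    every fourth firing."""
    phases = itertools.cycle(pattern)

    def fire(in_buf, out_buf):
        token = in_buf.pop(0)      # consume exactly one token per firing
        if next(phases):           # produce per the cyclic pattern
            out_buf.append(token)

    return fire
```

Firing the decimator eight times over the input tokens 0 through 7 passes through only tokens 0 and 4, matching the period-4 pattern.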
 We refer readers to Bilsen et al. [1996] for more details on the CSDF model. As shown in Figures 1a and 1b, it may be possible to transform a CSDF actor into an SDF actor. In general, when feedback loops are present in a dataflow graph, such a transformation may introduce deadlock, and therefore should be attempted with caution. Such a transformation, when admissible (not leading to deadlock), generally involves trade-offs in terms of relevant metrics, including latency, throughput, and code size. More detailed comparisons between the SDF and CSDF models of computation are presented in Parks et al. and Bhattacharyya et al.
3.1.3. Parameterized Dataflow
 Though CSDF provides enhanced expressive power compared to SDF, it is still unable to specify patterns in token consumption and production volumes that are not fully known at compile time. A meta-modeling technique called PDF has been proposed to represent certain kinds of dataflow application dynamics [Bhattacharya and Bhattacharyya, 2000]. This model can be used with any dataflow graph format that has a well-defined notion of a schedule iteration. For example, the PDF meta-model, when combined with an underlying SDF model, results in the PSDF (parameterized synchronous dataflow) model. A PSDF graph behaves like an SDF graph during one schedule iteration, but can assume different configurations across different schedule iterations.
 The PDF meta-model supports both semantic and syntactic hierarchy. Syntactic hierarchy is used, as in other forms of dataflow, to decompose complex designs into smaller components. Semantic hierarchy in PDF, on the other hand, is used to apply specific features of the meta-model that are associated with dynamic parameter reconfiguration. A hierarchical actor that encapsulates such semantic hierarchy in PDF is called a PDF subsystem. A PDF subsystem in turn has three underlying graphs, called the init, subinit, and body graphs, which interact with each other in structured ways. Intuitively, the init and subinit graphs capture data-dependent, dynamic behavior at certain points during the execution of the graph and configure the body graph to adapt in useful ways to such dynamics. The init graph is designed to capture parameter configuration that is driven by higher, system-level processing, while the subinit graph is designed to capture parameter changes occurring across different iterations of the corresponding body graph. The init graph can be used to dynamically configure parameters in the subinit graph, which, in general, executes more frequently than the init graph.
 To further illustrate the PDF modeling technique, we consider the application example shown in Figure 2a. This example involves an FIR filter with filter taps (coefficients) given by CN = [c0, c1, …, cN−1], followed by a decimator with a tunable decimation factor of D. The values of D and CN are set either through a higher level system or through a user interface. We skip the details of this mechanism for the sake of simplicity and conciseness. Such behavior can be modeled using PDF with an underlying CSDF model. This modeling approach is referred to as the parameterized cyclo-static dataflow (PCSDF) model [Saha et al., 2006]. Figure 2b shows one of the possible PCSDF graphs corresponding to the application shown in Figure 2a. The subsystem DF is a PCSDF subsystem with its component graphs as shown in the figure. Here, the control actor in the DF.init graph of the DF subsystem sets the required external and internal parameters, D and CN, respectively. This actor models the required parameter control through either a higher level system or some form of user interface. In this particular case, the DF.subinit graph is empty (in general, the init, subinit, and body graphs do not all have to be used for a given subsystem).
 The PCSDF model allows CSDF actors for which the cyclic patterns of token production and consumption volumes can be parameterized in terms of their periods, the actual numbers of tokens consumed or produced in the cyclo-static sequences, or both. Intuitively, for a given configuration of application parameters, a PCSDF graph behaves as a CSDF graph. However, a PCSDF graph not only models all possible parameter configurations in a given application but also describes how they can be changed at run-time.
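 This parameterization of the tunable decimator can be sketched as follows, where the period of the production pattern is the run-time-configurable parameter D (the helper names are ours; for any fixed value of D the behavior reduces to ordinary CSDF):

```python
def production_pattern(D):
    """Cyclo-static production pattern of a decimator with factor D:
    one token produced, followed by D - 1 firings with no production."""
    return [1] + [0] * (D - 1)

def run_iteration(D, samples):
    """One schedule iteration under a fixed configuration of D; within
    the iteration the graph behaves as a plain CSDF graph (keeping
    every D-th sample).  Between iterations, the init/subinit graphs
    may reconfigure D."""
    pattern = production_pattern(D)
    return [s for i, s in enumerate(samples) if pattern[i % D]]
```

Running one iteration with D = 4 over the samples 0 through 7 keeps [0, 4]; reconfiguring D to 2 for the next iteration keeps [0, 2, 4, 6] instead, illustrating how the cyclic pattern changes with the parameter.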
 Such a model is of particular interest for modeling multirate DSP systems that exhibit parameterizable sample rate conversions. PCSDF allows designers to systematically explore design spaces across static, quasi-static, and dynamic implementation techniques. Here, by quasi-static implementation techniques, we mean techniques where relatively large portions of the associated software or hardware structures are fixed at compile-time, with minor adjustments allowed at run-time (e.g., in response to changes in input data or operating conditions). A variety of quasi-static dataflow techniques are discussed, for example, in Bhattacharyya et al.
3.2. The Dataflow Interchange Format
 To describe dataflow applications for a wide range of DSP applications, application developers can use the DIF language, which is a standard language founded in dataflow semantics and tailored for DSP system design [Hsu et al., 2005]. DIF provides an integrated set of syntactic and semantic features that help promote high-level modeling, analysis, and optimization of DSP applications and their implementations without over-specification. From a dataflow point of view, DIF is designed to describe mixed-grain graph topologies and hierarchies as well as to specify dataflow-related and actor-specific information. The dataflow semantic specification is based on dataflow modeling theory and independent of any design tool.
 Figure 7 illustrates some of the available constructs in the DIF language, along with the syntax used for application specification. More details on the DIF language can be found in Hsu et al. [2005]. The topology block of a specification defines the graph topology, which includes all of the nodes and edges in the graph. DIF supports built-in attributes such as interface, refinement, parameter, and actor, which identify specifications related to graph interfaces, hierarchical subsystems, dataflow parameters, and actor configurations, respectively. DIF also allows user-defined attributes, which have syntax similar to that of built-in attributes, except that they must be declared with the attribute keyword.
 The DIF language has recently been augmented with constructs for supporting topological patterns [Sane et al., 2010]. Topological patterns allow concise specification of functional structures at the dataflow graph (inter-actor) level. They can effectively represent many of the flowgraph substructures that are pervasive in the DSP application domain (e.g., chain, ring, and butterfly) to generate compact, scalable application representations. We direct readers to Sane et al. [2010, 2011] for more information on the concept of topological patterns and how DIF supports them.
 To facilitate use of the DIF language, the DIF package (TDP) has been built (see Figure 8). Along with the ability to transform DIF descriptions into manipulable internal representations, TDP contains graph utilities, optimization engines, verification techniques, a comprehensive functional simulation framework, and a software synthesis framework for generating C code [Hsu et al., 2005; Plishker et al., 2008]. These facilities make TDP an effective environment for modeling dataflow applications, providing interoperability with other design environments, and developing and experimenting with new tools and dataflow techniques. Beyond these features, DIF is also suitable as a design environment for implementing dataflow-based application representations. Describing an application graph is done by listing nodes (actors) and edges, and then annotating dataflow specific information as well as other (non-dataflow) kinds of relevant information associated with actors, edges, and design subsystems.
 The framework in DIF for simulation and functional verification of applications, which is based on CFDF semantics, allows application specifications in DIF to be used as executable references for rapid system prototyping and for developing further platform-specific implementations. CFDF, which supports dynamic dataflow behaviors, allows flexible and efficient prototyping of dataflow-based application representations, and permits natural description of both dynamic and static dataflow actors. More information on CFDF semantics can be found in Plishker et al. [2008].
3.3. Related Work
 There exist high-end reusable, modular, scalable, and reconfigurable FPGA platforms, such as the Berkeley Emulation Engine 2 (BEE2) [Chang et al., 2005], IBOB [Parsons et al., 2006], and UniBoard [Szomoru, 2011], which have been introduced specifically for DSP systems. These have been widely used for radio astronomy applications. The BEE2 uses SDF as a unified computation model for both the microprocessor and the reconfigurable fabric. It uses a high-level block diagram design environment based on The Mathworks' Simulink and the Xilinx System Generator (XSG). This design environment, however, does not expose the underlying dataflow model. In particular, the designer has little or no scope to make use of the underlying dataflow model for experimentation (as mentioned earlier in section 1). Also, the SDF model used for programming the BEE2 is a static dataflow model in that all the dataflow information is available at compile-time (i.e., before executing or running the application). Though this feature provides maximal compile-time predictability, it has limited expressive power. It does not allow for data-dependent, dynamic behavior, which is exhibited by many modern DSP applications, such as the TDD application introduced in section 2 (see Bhattacharyya et al. for more examples of such applications). Other forms of dataflow models that can capture more application dynamics with acceptable levels of compile-time predictability may better exploit the features offered by platforms such as the BEE2. We should, however, mention that the CASPER DSP library offers a software register block that can provide limited parameterization in a design. We have used this block extensively in our TDD design.
 Other FPGA design solutions and tool flows are also available (e.g., those from Nallatech (http://www.nallatech.com) and Lyrtech (http://www.lyrtech.com)). These, however, are commercial tools and do not provide open-source DSP software libraries as CASPER does. Also, unlike these commercial tools, the CASPER tools support most Xilinx FPGA devices.
 Model-based approaches for designing large-scale signal processing systems, with a focus on radio telescopes, have been previously studied [e.g., see Alliot and Deprettere, 2004; Lemaitre and Deprettere, 2006; Lemaitre, 2008]. Several frameworks have been proposed for model-based, high-level abstractions of architectures, along with performance/cost estimation methods to guide the designer throughout the development cycle [see Alliot and Deprettere, 2004]. However, the focus of these approaches has been on architecture exploration. There have also been attempts to derive implementation-level specifications starting from system-level specifications by segregating signal processing and control flow (see Lee and Seshia for more information on control flow) into an application specification and an architecture specification, respectively [see Lemaitre and Deprettere, 2006; Lemaitre, 2008]. However, the choice of models of computation has been made primarily from control flow considerations rather than dataflow considerations. These approaches, though relevant, do not specifically address the issue of high-level application specification for platform-independent prototyping and the use of models of computation for abstraction of heterogeneous or hybrid dataflow behaviors. This issue is critical to efficient prototyping of high performance signal processing applications, which are typically dataflow dominated and include increasing levels of dynamic dataflow behavior [e.g., see Bhattacharyya et al., 2010].
 We address this issue by using the CFDF model, with underlying PSDF or PCSDF behavior, for system prototyping. We then show how platform-independent specifications based on this modeling technique can be used to efficiently develop platform-specific implementations.