for 6.111 Introduction to Digital Systems
ZBT SRAM Interface
The easiest way to implement memory on the labkit is to use the FPGA's built-in SRAM blocks. However, some designs may require more memory than is available inside the FPGA.
The labkit contains two synchronous, 512k x 36 ZBT SRAM chips (on the back side of the PCB). These are relatively high-speed devices (up to 167MHz), but they can be somewhat tricky to interface to.
ZBT Interface Protocol
The ZBT memories are synchronous: they require a clock input, and their inputs are only sampled on the rising edges of that clock. They are also pipelined, with the data bus being delayed by two cycles after the address and control signals. In other words, if a read cycle is initiated (~CE and ~OE are low, ~WE is high, and the address is valid) on clock cycle n, then the data read from that address will be available on cycle n+2. Similarly, for a write cycle, the data to be written is supplied on cycle n+2.
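The two-cycle pipelining described above can be sketched as a small behavioral model. This is a Python simulation for illustration only, not code for the FPGA. The class name ZbtModel and its clock method are hypothetical, and the ~CE/~OE handling is reduced to "a command is either present or absent on a given cycle".

```python
from collections import deque

DATA_MASK = (1 << 36) - 1  # the RAMs are 36 bits wide


class ZbtModel:
    """Hypothetical behavioral model of one ZBT SRAM (no burst mode)."""

    def __init__(self):
        self.mem = {}
        # Commands issued one and two cycles ago; None means "no access".
        self.pipe = deque([None, None], maxlen=2)

    def clock(self, addr=None, we_n=1, data_in=None):
        """Simulate one rising clock edge.

        addr, we_n: command presented this cycle (addr=None: no access).
        data_in:    value on the data bus this cycle; it belongs to the
                    write that was issued two cycles earlier.
        Returns the data the RAM drives this cycle (for a read issued two
        cycles earlier), or None if the RAM is not driving the bus.
        """
        data_out = None
        oldest = self.pipe[0]  # command issued two cycles ago
        if oldest is not None:
            old_addr, old_we_n = oldest
            if old_we_n == 0:
                if data_in is None:
                    raise ValueError("write data must be supplied on cycle n+2")
                self.mem[old_addr] = data_in & DATA_MASK
            else:
                data_out = self.mem.get(old_addr)
        # maxlen=2 drops the oldest command as the new one is appended.
        self.pipe.append(None if addr is None else (addr, we_n))
        return data_out


ram = ZbtModel()
bus = [
    ram.clock(addr=5, we_n=0),   # cycle n:   issue write to address 5
    ram.clock(addr=5, we_n=1),   # cycle n+1: issue read of address 5
    ram.clock(data_in=0xABC),    # cycle n+2: supply the write data
    ram.clock(),                 # cycle n+3: read data appears
]
print(bus)
```

In this sketch the first three calls return None and the value written only shows up on the fourth call, two cycles after the read was issued. Logic that talks to the real memories has to delay its write data, and its capture of read data, by the same two cycles.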
The ZBT memories have the following control signals:
The figure below illustrates normal read and write transactions (i.e., not using burst mode) for the ZBT memories. The same clock is used by both the FPGA logic and the RAMs. (More about that in the next section.)
Figure: ZBT memory access protocol
Memory Clock Generation
Inside the FPGA, clocks are distributed using dedicated clock trees, which ensure that the clock signals reach every flip-flop nearly simultaneously. If the clock inputs of the ZBT memories are driven by outputs of the FPGA, then the clock signal at the memories will be delayed by the sum of the propagation delay through the FPGA output pins and the propagation delay of the PCB trace.
To correct this skew at the memory devices, we need to drive the ZBT clock inputs with a phase-shifted version of the clock, so that the rising clock edge reaches the memory devices at the same time that it reaches all the registers in the FPGA. To generate this phase-shifted clock, a delay-locked loop (DLL) is used. DLLs are fundamentally analog components. There is no way to infer a DLL using Verilog code, so they must be instantiated. The Xilinx library component containing a DLL is the digital clock manager, or DCM.
The following is a high-level and incomplete description of the operation of DCMs. For more information, see the Libraries Guide volume of the Xilinx documentation. Essentially, a DCM takes a reference clock input signal on its CLKIN port, and outputs a delayed copy of that clock on its CLK0 output port. The CLK0 output is generally used to drive a clock distribution tree (a BUFG primitive in the Xilinx library). One output of the clock distribution tree should be used to drive the feedback input (CLKFB) of the DCM. The delay between the CLKIN and CLK0 ports on the DCM is automatically adjusted by a feedback loop until the CLKIN and CLKFB inputs are in phase. Once the phase difference between CLKIN and CLKFB has been minimized, the DCM is said to be "locked", and outputs of the clock distribution tree should be exactly in phase with the input clock signal. Effectively, the propagation delay through the clock distribution tree has been canceled, because the total propagation delay from the CLKIN pin of the DCM to the output of the clock distribution tree is exactly one clock period.
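The delay cancellation can be checked with a little arithmetic. The Python sketch below is only a model of the lock condition, not a description of how the DCM is built. The function name is hypothetical, and the clock period and delay values are assumed numbers chosen for illustration.

```python
def dcm_inserted_delay(period_ps, feedback_path_ps):
    """Delay (in ps) a locked DCM adds between CLKIN and CLK0.

    Lock means CLKFB is in phase with CLKIN, i.e. the inserted delay plus
    the feedback path delay is a whole number of clock periods.
    """
    return (-feedback_path_ps) % period_ps


PERIOD_PS = 37_037      # assumed: roughly a 27 MHz reference clock
TREE_DELAY_PS = 3_200   # assumed: hypothetical clock-tree (BUFG) delay

dcm_delay = dcm_inserted_delay(PERIOD_PS, TREE_DELAY_PS)
total = dcm_delay + TREE_DELAY_PS
print(dcm_delay, total)
```

With these assumed numbers the DCM inserts 33,837 ps, so the total delay from CLKIN to the clock tree outputs equals one full period and the tree outputs line up with the reference clock. The same calculation applies to whatever path is placed in the feedback loop: if the loop contains an output pin and a PCB trace, those are the delays that get canceled.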
The schematic below shows how two DCMs can be used to ensure that the ZBT memories are clocked at exactly the same time as the FPGA flip-flops.
Figure: ZBT memory clock de-skew circuitry
In the circuitry above, the lower DCM is used to ensure that the
The 16-bit shift register (SRL16) is used to ensure the DCMs lock properly when the FPGA finishes its configuration process. During configuration, the FPGA's I/O pins are all held in tristate. The DCMs are released from reset a few clock cycles before the global tristate signal is released. The upper DCM therefore attempts to lock without a feedback input. It is possible for the DCM to get stuck in a funny state and never properly lock. To prevent this, a shift register is used to trigger a reset of both DCMs shortly after the entire FPGA configuration process is complete and the I/O pins have been enabled.
The LOCKED output of a DCM signals that the feedback loop on the DCM's internal DLL has stabilized. The
Verilog code implementing this circuitry is available here. Note that, in order to minimize routing delay on the clock signals, this code utilizes input clock buffers (IBUFG) for both the reference clock and feedback inputs. It also utilizes I/O flip-flops (OFDDRRSE) for the RAM and feedback clock outputs. These special primitives are actually part of the I/O driver circuitry on the FPGA. This means that you will not be able to use these inputs or outputs anywhere else in your code, because the IBUFG and OFDDRRSE gates place these signals outside of the range of the general routing mesh.
MIT 6.111 Introduction to Digital Systems, Updated April 26, 2007