# Method to Minimize the Clock Skew and Uniform Clock Distribution using Parallel Port in Pipe Line Based Multi Channel DMA Request Terminal for Frequency Measurement

<sup>1</sup> N. Suresh Kumar, <sup>2</sup>D.V. RamaKotiReddy, <sup>3</sup>A.Harish, <sup>4</sup>S.Amarnadh <sup>1,3,4</sup> Dept of IT, GITAM University, Visakhapatnam <sup>2</sup> College of Engineering, Andhra University, Visakhapatnam <sup>1</sup>nskgitam2009@gmail.com

Abstract - This paper presents a wide-range digital speed measurement method that combines a jitter removal technique with the terminal count register (TCR) of a direct memory access (DMA) controller. The method also supports multi-node interfacing from different measure ends. The measure ends are interfaced with the DMA channels through pipelines to improve the hit ratio, where hit ratio means the identification of encoder pulses without any miss. The conventional pipeline system, however, suffers from improper synchronization of clock pulses, a problem common to all digital systems and usually called jitter or skew. Here a new system is implemented in the clock path to remove or reduce the clock skew. Jitter is also introduced in the pipeline because the clock follows different paths to the parallel pipelines. When a single clock pulse is used to feed the encoder pulses, the remaining pipelines stay idle while one pipeline accesses the encoder pulses, and supplying separate clock pulses to the individual pipelines is itself a challenge. This is overcome by using parallel port pins as the clock signals. The DMA method is based on pulse counting in the constant sampling time set at the terminal count stop pin of a DMA controller. The hardware configuration and the algorithms for a microcontroller implementation are also presented. The proposed method suits systems that use microcontrollers with a DMA controller and timers. Limitations and sources of error are discussed in detail. The DMA terminal count register method is suitable for real-time speed control systems.

## I INTRODUCTION

Speed measurement can be achieved using the following methods:

- 1. Time measurement: determines the time interval between pulses [1]
- 2. Pulse counting: counts input pulses within the sampling time [2]
- 3. Combined method [3]
- 4. Constant Elapsed Time (CET) method [4]
- 5. DMA transfer method [4][5]
- 6. Pipeline based multi channel DMA method [9]

When the processor in the hardware configuration of the DMA transfer method [4][5] executes a long instruction, the DMA acknowledge signal (DACK) may not be received before the next rising edge of the input pulses, so that pulse is not detected. To avoid losing input pulses, a counter of unperformed DMA requests is used. The problem can also be solved with a simpler hardware configuration and lower power consumption by using the terminal count register and the TC stop pin of the DMA controller [9]. The TC stop pin changes its state after a fixed number of DMA cycles. The pulses generated by the oscillator are counted within these DMA cycles, and this count gives the speed to be measured. For accurate tracing and to avoid spikes at DRQ, a two-stage pipeline is implemented [9]. A two-stage pipeline [8][9] supports faster data rates, although only up to a limited extent.

Compared with conventional speed measurement systems, the pipeline system performs much better. The conventional pipeline system, however, suffers from improper synchronization of clock pulses, a problem common to all digital systems and usually called jitter or skew. Here a new system is implemented in the clock path to remove or reduce the clock skew. Jitter is also introduced in the pipeline by the different clock paths from the different measure ends to the parallel pipelines. It can be reduced by controlling the clock pulses supplied to the parallel pipelines interfaced with the multiple measure ends. Applying clock pulses to the parallel pipelines simultaneously can itself be a problem; it is avoided by supplying the clock pulses from the parallel ports available on a microcontroller instead of an external timer or counter. A microprocessor without an internal port architecture needs an external port device to supply pulses to the pipelines simultaneously.

## II TERMINAL COUNT DMA METHOD

In this method the average number of pulses in the buffer is counted; it indicates the speed of the disk or the number of pulses that arrived at DRQ. The rotational speed can be calculated from the quotient $\Delta \Phi / \Delta t$, where $\Delta \Phi$ is the increment of the rotational angle during the time interval $\Delta t$,

where *m* is the number of encoder marks per turn, $C_P$ is the number of encoder pulses, $C_T$ is the number of time clocks measured, and $T_C$ is the clock pulse period. Using (1) and (2), the rotational speed *n* can be calculated as follows:

$$n = \frac{C_P}{C_T \cdot m \cdot T_C} \qquad (3)$$
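As a worked illustration of (3), the following minimal sketch evaluates the speed from the two counts. The function name and the example values are assumptions chosen for illustration (the clock period of 0.5 µs corresponds to a 2 MHz reference clock), and the result is taken to be in revolutions per second because $C_P / m$ is a number of turns and $C_T \cdot T_C$ is a time in seconds.

```python
def rotational_speed(c_p: int, c_t: int, m: int, t_c: float) -> float:
    """Evaluate equation (3): n = C_P / (C_T * m * T_C).

    c_p: number of encoder pulses counted
    c_t: number of reference clock pulses counted
    m:   encoder marks per turn
    t_c: reference clock period in seconds
    Returns the speed in revolutions per second (assumed unit).
    """
    if c_t <= 0 or m <= 0 or t_c <= 0:
        raise ValueError("c_t, m and t_c must be positive")
    return c_p / (c_t * m * t_c)


# Illustrative values only: 100 encoder pulses, 4000 clock ticks of 0.5 us,
# and an encoder with 1000 marks per turn.
n_rev_per_s = rotational_speed(c_p=100, c_t=4000, m=1000, t_c=0.5e-6)
n_rpm = 60.0 * n_rev_per_s
print(f"{n_rev_per_s:.3f} rev/s = {n_rpm:.1f} rpm")
```

With these assumed values the measurement window is 2 ms and the encoder has advanced one tenth of a turn, which corresponds to about 50 rev/s.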
  
The minimum measurable speed is

$$n_{min} = \frac{1}{T_{max} \cdot m}$$

where $T_{max}$ is the maximum response time.

RELATIVE ERROR

| Speed       | Pulse counting | Pulse time measuring | CET    | TC DMA |
|-------------|----------------|----------------------|--------|--------|
| at 30 rpm   | 85%            | 0.025%               | 0.025% | 0.01%  |
| at 3000 rpm | 0.85%          | 2.5%                 | 0.05%  | 0.025% |

NOT READY SEQUENCE



A 16-bit microprocessor is used to control the DMA controller. The microprocessor is operated at a 5 MHz clock frequency and the DMA controller at 8 MHz. The reference clock $f_{REF}$ for the external free-running timer and the interval timer inside the microcontroller is 2 MHz. The terminal count DMA method is verified using 20 MHz frequency generators; a tachometer is interfaced to generate the pulses for DRQ of the DMA. Two buffers are placed between the DMA and the timer to introduce a delay. This hardware delay enables the buffer output data pins before the timer is enabled.

The number of auto-reload cycles is decided by the user program. It can also be set with the timer reset signal by connecting additional logic circuitry.

## III PIPELINE OPERATION

In the pipeline series a two-stage pipeline is integrated to reduce data loss compared with conventional systems. The system has cascaded SISO shift registers with a common clock pulse supplied by the decoder output, so that the next data bit is retained while the current bit is pushed to the DMA. As a bit enters the DMA, the vacated bit position in the SISO shift register is filled with the next bit, and a new bit from the encoder is inserted into the first bit position of the pipeline. This improves the data transmission rate and reduces data loss in wait states and not-ready sequences.

The two-stage pipeline is integrated in the circuit to achieve the best hit ratio. The individual bits from the encoder are fed to the registers, stored there temporarily, and pushed on serially. In the not-ready sequence, between two DACK signals, in wait states, and in all other states, bits are accessed from the individual bit positions of the pipeline registers.
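The shifting behaviour described above can be sketched as a small software model. This is only an illustration under simplifying assumptions: each register holds one bit, one clock edge moves every bit one stage forward, and the class and function names are hypothetical rather than taken from the hardware design.

```python
from typing import List, Optional


class TwoStagePipeline:
    """Software model of two cascaded one-bit registers on a common clock."""

    def __init__(self) -> None:
        # stages[0] receives the encoder bit, stages[1] feeds the DMA request input.
        self.stages: List[Optional[int]] = [None, None]

    def clock(self, new_bit: Optional[int]) -> Optional[int]:
        """Apply one clock edge and return the bit that leaves the pipeline."""
        leaving = self.stages[1]
        self.stages[1] = self.stages[0]   # current bit moves towards the DMA
        self.stages[0] = new_bit          # vacated position takes the next bit
        return leaving


def pass_through(bits: List[int]) -> List[int]:
    """Clock a bit sequence through the pipeline and collect what reaches the DMA."""
    pipe = TwoStagePipeline()
    delivered: List[int] = []
    # Two extra clock edges with no new input flush the bits still held in the stages.
    for bit in list(bits) + [None, None]:
        out = pipe.clock(bit)
        if out is not None:
            delivered.append(out)
    return delivered


print(pass_through([1, 0, 1, 1]))
```

In this model every bit reaches the output two clock edges after it enters, and the order of the bits is preserved.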



Figure 1 Multiple pipeline system

Equation (1) defines the clock period for a pipeline system, where $D_{max}$ is the largest of the maximum propagation delays of all stages in the pipeline [8].

$$T_{clk} > D_{max} + D_R + t_s + \Delta clk \qquad (1)$$

For (1) to be valid, the following condition must be satisfied:

$$D_{min} + D_R > t_h + \Delta clk \qquad (2)$$

The condition in (2) ensures that new data does not appear at the input of a register before its hold time is up.

- $T_{clk}$ = clock period
- $\Delta$ = constructive clock skew
- $\Delta clk$ = clock uncertainties
- $t_h$ = pipeline register hold time
- $t_s$ = pipeline register setup time
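The two conditions can be checked numerically. The sketch below is a minimal example with hypothetical names and made-up delay values in nanoseconds; $D_R$ is taken here to be the register propagation delay, which is an assumption because the text does not define it.

```python
from dataclasses import dataclass


@dataclass
class StageTiming:
    """Timing parameters in nanoseconds (illustrative values, not measured data)."""
    d_max: float   # largest stage propagation delay
    d_min: float   # smallest stage propagation delay
    d_r: float     # register propagation delay (assumed meaning of D_R)
    t_s: float     # register setup time
    t_h: float     # register hold time
    dclk: float    # clock uncertainty


def min_clock_period(t: StageTiming) -> float:
    """Right-hand side of condition (1); T_clk must be strictly greater than this."""
    return t.d_max + t.d_r + t.t_s + t.dclk


def hold_condition_met(t: StageTiming) -> bool:
    """Condition (2): D_min + D_R > t_h + dclk."""
    return t.d_min + t.d_r > t.t_h + t.dclk


timing = StageTiming(d_max=12.0, d_min=3.0, d_r=1.0, t_s=0.5, t_h=0.4, dclk=0.6)
print("T_clk must exceed", min_clock_period(timing), "ns")
print("hold condition met:", hold_condition_met(timing))
```

If the hold check fails, the clock period bound from (1) no longer guarantees correct operation, which is why both conditions are evaluated together.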



Figure 2 Clock connected to two registers

A single clock pulse manages the data transmission through the registers in the pipeline. This creates a clock skew in the pipeline, which decreases the data speed from one stage to the next. The pulses from the encoder are fed into the first register only when the clock pulse is applied to the first stage of the pipeline, and a pulse passes to the next stage only after the clock pulse is applied to that stage. The clock path goes directly to the registers, whereas the encoder pulses pass from one stage to the next through the flip-flops of the registers. Pulses may therefore overlap in the first stage before they enter the next stage.

To avoid this overlap, a delay element is included in the path of the clock pulse. The delay is equal to the delay incurred by a pulse passing from one stage to the next.



Figure 3 A delay inserted in clock path to remove clock skew

At the same time, the clock pulses applied to the individual stages of the parallel pipeline system must be considered. It is very difficult to control the clock pulses for all four pipelines.



Figure 4 Parallel pipeline system

Equation (1) defines the clock period for a pipeline system, where $D_{max}$ is the largest of the maximum propagation delays and $D_{min}$ is the minimum propagation delay of all stages in the pipeline [8].

$$T_{clk} > D_{max} - D_{min} + D_R + t_s + \Delta clk \qquad (1)$$

So, $D_{max} - D_{min} < T_{clk} - (D_R + t_s + \Delta clk)$
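The rearranged inequality gives an upper bound on the delay spread $D_{max} - D_{min}$ that a given clock period tolerates. The following sketch evaluates that bound; the function names and the numbers (in nanoseconds) are assumptions for illustration only.

```python
def max_delay_spread(t_clk: float, d_r: float, t_s: float, dclk: float) -> float:
    """Upper bound on D_max - D_min: T_clk - (D_R + t_s + dclk)."""
    return t_clk - (d_r + t_s + dclk)


def spread_within_bound(d_max: float, d_min: float, t_clk: float,
                        d_r: float, t_s: float, dclk: float) -> bool:
    """True when D_max - D_min is strictly below the bound."""
    return (d_max - d_min) < max_delay_spread(t_clk, d_r, t_s, dclk)


# Illustrative numbers: a 10 ns clock with 2 ns of register and clock overhead.
bound = max_delay_spread(t_clk=10.0, d_r=1.0, t_s=0.5, dclk=0.5)
print("allowed delay spread below", bound, "ns")
print(spread_within_bound(12.0, 3.0, 10.0, 1.0, 0.5, 0.5))  # spread of 9 ns
print(spread_within_bound(12.0, 5.0, 10.0, 1.0, 0.5, 0.5))  # spread of 7 ns
```

A spread above the bound means the paths between stages differ too much for the chosen clock period, so either the period is lengthened or the shorter paths are padded with delay.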

A method is therefore developed to distribute the clock pulses evenly. All the clock pulses are supplied simultaneously through a programmable parallel port. The port pins independently control the clock supply to the individual pipelines. The ports are programmed by defining a control word in the microcontroller and are configured as outputs to supply the clock pulses to the pipelines. The width of the clock pulse is set by a software delay executed between the high and low states of the port pin.
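The idea of driving all pipeline clocks from one port write, with the pulse width set by a software delay, can be sketched as follows. This is a hedged illustration and not the authors' assembly routine: `write_port` is a hypothetical callback standing in for the port output instruction, and the mask `0x0F` assumes the four clock lines sit on the low four pins of the port.

```python
import time
from typing import Callable, List


def generate_clock(write_port: Callable[[int], None], pin_mask: int,
                   cycles: int, half_period_s: float) -> None:
    """Toggle all pins in pin_mask together so every pipeline sees the same edge."""
    for _ in range(cycles):
        write_port(pin_mask)        # one write drives all selected pins high at once
        time.sleep(half_period_s)   # software delay sets the high time
        write_port(0x00)            # one write drives the port low again
        time.sleep(half_period_s)   # software delay sets the low time


# Stand-in for real hardware: record each value written to the port.
writes: List[int] = []
generate_clock(writes.append, pin_mask=0x0F, cycles=3, half_period_s=0.0)
print(writes)
```

Because each edge is produced by a single write to the port, the four pins change in the same instruction, which is the property the parallel port clock relies on. Writing `0x00` clears every pin of the port in this sketch; a real routine that shares the port with other signals would need to preserve the remaining bits.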

## IV HARDWARE CONFIGURATION

While one pipeline is enabled, however, the remaining pipelines stay idle. This wasted time can be reduced by generating the clock from parallel ports. A parallel port is used to select the pipelines in sequence and also to produce the necessary clock pulses for them. When the microprocessor enables the first output port pin, the first pipeline carries the encoder pulses to the DRQ input of the DMA controller. The port outputs are enabled simultaneously so that the pipelines are enabled at the same time. Four port pins are thus used to supply the clock pulses, and no timer or counter is needed.



Figure 5 parallel port clock system for parallel pipelining system

The rising edges of the pulses from the encoder are first placed in the pipeline. These pulses are read by the DRQ pin of the DMA controller. As DRQ responds to the pulses, the terminal count register of the DMA controller, which is preset by the user program, is decremented.



Figure 6 DMA method to count pulses from encoder

Until TC goes high, the pulses from the free-running timer are counted by a buffer, that is, until the terminal count register value reaches zero. The number of pulses in the buffer indicates the speed of the disk or the number of pulses that arrived at DRQ.
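The counting window can be illustrated with a small numerical model. The sketch assumes an idealized case: the terminal count register is preset to `n_pulses`, each encoder pulse decrements it once, and the free-running timer ticks at a fixed reference frequency. The function names are hypothetical, and the 2 MHz reference matches the $f_{REF}$ value used in the hardware setup.

```python
def timer_ticks_in_window(n_pulses: int, f_enc_hz: int, f_ref_hz: int) -> int:
    """Whole reference ticks counted while n_pulses encoder pulses arrive at DRQ.

    The window lasts n_pulses / f_enc_hz seconds; integer arithmetic is used so
    that the truncation of the last partial tick is explicit.
    """
    return (n_pulses * f_ref_hz) // f_enc_hz


def estimate_frequency(n_pulses: int, ticks: int, f_ref_hz: int) -> float:
    """Recover the encoder pulse frequency from the buffered tick count."""
    if ticks <= 0:
        raise ValueError("tick count must be positive")
    return n_pulses * f_ref_hz / ticks


F_REF_HZ = 2_000_000   # 2 MHz reference clock
N_PULSES = 100         # assumed terminal count preset

ticks = timer_ticks_in_window(N_PULSES, f_enc_hz=5000, f_ref_hz=F_REF_HZ)
print(ticks, estimate_frequency(N_PULSES, ticks, F_REF_HZ))
```

In this idealized model the tick count falls as the encoder frequency rises, so the frequency is recovered by dividing rather than multiplying, and the only error is the truncation of the final partial tick.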

## V SOFTWARE ROUTINES

Step 1: Define four port pins as output ports.

Step 2: Send high and low signals alternately through the parallel port to supply the clock input to the pipelines.

Step 3: Mask the channel that is used to access the peripheral.

Step 4: Enable the TC stop bit and the auto reload bit of the mode set register.

Step 5: After the TC pin enables the buffer, unmask the channel.

Step 6: Repeat steps 3 to 5 as many times as the user requires to take the average values.

The code is developed in assembly language using MASM software. The code can also be compiled and executed on a personal computer at the debug level by giving the internal DMA port addresses [6].
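The order of the steps can be sketched against a mock object in place of the real port and DMA registers. This is not the authors' assembly code: every class and method name below is hypothetical, the buffer readings are made-up values, and the clock is toggled only once for brevity although the real routine keeps it running.

```python
from statistics import mean
from typing import List, Tuple


class MockHardware:
    """Hypothetical stand-in for the port and DMA registers; records each call."""

    def __init__(self, buffer_counts: List[int]) -> None:
        self._counts = list(buffer_counts)
        self.log: List[Tuple[str, int]] = []

    def set_port_pins_as_output(self, mask: int) -> None:
        self.log.append(("port_output", mask))

    def toggle_clock(self, mask: int) -> None:
        self.log.append(("clock_high", mask))
        self.log.append(("clock_low", mask))

    def mask_channel(self, channel: int) -> None:
        self.log.append(("mask", channel))

    def enable_tc_stop_and_auto_reload(self) -> None:
        self.log.append(("mode_set", 0))

    def wait_for_tc_and_read_buffer(self) -> int:
        return self._counts.pop(0)

    def unmask_channel(self, channel: int) -> None:
        self.log.append(("unmask", channel))


def run_measurement(hw: MockHardware, channel: int, repeats: int,
                    pin_mask: int = 0x0F) -> float:
    hw.set_port_pins_as_output(pin_mask)                    # Step 1
    hw.toggle_clock(pin_mask)                               # Step 2
    readings = []
    for _ in range(repeats):                                # Step 6
        hw.mask_channel(channel)                            # Step 3
        hw.enable_tc_stop_and_auto_reload()                 # Step 4
        readings.append(hw.wait_for_tc_and_read_buffer())   # Step 5: TC reached
        hw.unmask_channel(channel)                          # Step 5: unmask
    return mean(readings)


hw = MockHardware([40000, 40010, 39990])
average = run_measurement(hw, channel=0, repeats=3)
print(average)
```

Averaging over several repetitions, as in Step 6, smooths out the one-tick uncertainty of each individual buffer reading.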

## VI CONCLUSIONS

The main problem, clock skew in the DMA and TC DMA methods, is reduced by using pipelines and by introducing delay paths in the clock path to the pipelines. With a single clock system, enabling one pipeline leaves the remaining pipelines idle. The jitter due to multiple pipelines is minimized by applying the clock pulses simultaneously through parallel ports. The DMA controller is very easy to interface, and the cost is low compared with other methods. The method avoids the time constraints of accessing capture registers, which allows a large range of frequencies to be measured. The measurement error has a zero average value and stays almost fixed over a wide frequency range. The execution time can be improved further by using an ARM processor. In the pipeline, multiple clock pulses are applied to the pipeline stages through a decoder, so no timer or counter circuit is needed to control the clock pulses at the different register levels. The delay path in the clock to the pipeline improves the data transmission rate, reduces data loss in the wait states and not-ready sequences of the DMA controller, and reduces the clock skew. Multiple pipelines can be accessed at the same time using the parallel port clock system.

## REFERENCES

- 1. E. P. McCarthy, "A digital instantaneous frequency meter," IEEE *Trans. Instrum. Meas.*, vol. IM-28, no. 3, pp. 224-226, Sept. 1979.
- 2. T. J. Maloney and F. L. Alvarado, "A digital method for dc motor speed control," IEEE *Trans. Ind. Electron. Contr. Instrum.*, vol. IECI-23, no. 1, pp. 44-46, Feb. 1976.
- 3. T. Ohmae, T. Matsuda, K. Kamiyama, and M. Tachikawa, "A microprocessor controlled high-accuracy wide-range speed regulator for motor drives," IEEE *Trans. Ind. Electron.*, vol. IE-29, no. 3, pp. 207-211, 1982.
- 4. Milan Prokin, "DMA transfer method for wide-range speed and frequency measurement," IEEE *Trans. Instrum. Meas.*, vol. 42, no. 4, Aug. 1993.
- 5. Milan Prokin, "Speed measurement using the improved DMA transfer method," IEEE *Trans. Ind. Electron.*, vol. 38, no. 6, Dec. 1991.
- 6. Intel manuals, July 1990, Order Number: 003965-003 by Russian Military, http://doc.chipfind.ru/intel/8257.htm
- 7. Richard Bonert, "Design of a high performance digital tachometer with a microcontroller," IEEE *Trans. Instrum. Meas.*, vol. 38, no. 6, Dec. 1989.
- 8. Suryanarayana B. Tatapudi and José G. Delgado-Frias, "A Mesochronous high performance digital systems," vol. 53, no. 5, May 2006.
- 9. N. Suresh Kumar and D. V. RamaKotiReddy, "Pipe line based high speed multi channel DMA request terminal for frequency measurement," IJ-ETA-ETS, Volume 3, Issue 2.
- 10. C. Thomas Gray, "Timing constraints for wave pipelined systems," IEEE Transactions on Computer Aided Design of Integrated Circuits, vol. 13, no. 8, Aug. 1994.
- 11. Jabulani Nyathi, "A high performance hybrid wave pipelined linear feed back shift register with skew tolerant clocks," IEEE, pp. 1384-1387, 2004.
- 12. Mohammad Maymandi, "A digital programmable delay element: Design and analysis," IEEE Transactions on VLSI Systems, vol. 11, no. 5, Oct. 2003.
- 13. N. Suresh Kumar, "Digital frequency meter using DMA terminal count stop method," IJET, vol. 2 (1), pp. 34-36, 2010.